LETTER FROM THE EDITOR


Business is booming, 
and Xilinx is growing rapidly... 


Programmable logic continues to grow faster than other segments of the semiconductor market, and
Xilinx continues to grow along with it; there is no end in sight. To keep up with this unprecedented
expansion, we are building several new facilities, acquiring new companies, and incorporating
the best available complementary technologies.


We are leading the industry, not only with the most advanced device and software technologies but 
also with the most ambitious plans for future developments. Here are some of our most recent 


activities: 


• Xilinx purchases two new buildings in San Jose. The buildings will provide approximately 200,000
additional square feet and are expected to house up to 700 new Xilinx employees. The purchase of the
new buildings is the latest in a string of new construction projects we have undertaken in the last few
years. In 1999, we completed construction of a fourth building at our San Jose headquarters, and other
projects are also underway at Xilinx locations in Colorado, Ireland, and California.


• Xilinx acquires Visual Software Solutions, Inc. (VSS). Their expertise will help us further extend our
software leadership and allow us to deliver a variety of customized tools that facilitate HDL-based
design using our new Virtex-II FPGAs, thus improving your time-to-market. Included in the
acquisition are the VSS HDL Bencher™ and StateCAD™ design tools.


• Xilinx acquires RocketChips, a leading developer of ultra-high-speed CMOS mixed-signal
transceivers serving the networking, telecommunications, and enterprise storage markets.
The RocketChips gigabit and multi-gigabit serial CMOS transceiver technologies provide solutions
for a wide range of serial system architectures, and this technology will be a key feature of our
next-generation FPGA families.


• Xilinx acquires Tornado, a full-function formal verification application deploying state-of-the-art
circuit equivalence checking techniques. Based on many years of research and development efforts by
Veriphia, this new software adds significant value to our advanced development tools. We plan to
develop this technology even further and focus it on the Virtex FPGA architectures, in alliance with
key EDA partners.


• Xilinx acquires Integral Design, a privately held design services firm headquartered in Dublin, Ireland.
The acquisition enhances our professional design services capabilities in the communications and
multimedia market segments. Recent advances in FPGA performance and capabilities continue to drive
customer needs for additional design resources. Design services enable you to use dedicated designers
with experience in Xilinx solutions to augment your own internal expertise and improve your
time-to-market.


These developments continue to enhance our capability to offer you the best programmable logic 


devices, development tools, and services in the industry. 


Our current capabilities already give you a significant ease-of-use and time-to-market advantage. As 
the market expands, costs decrease, and many new applications become possible, thus fueling even 
more growth. You can see why programmable logic is quickly becoming the technology of choice for 
many more applications, from low-cost consumer devices to high-performance switching systems;
there simply is no faster or easier way to create the systems of the future. And Xilinx is well prepared
to continue leading the way.
By: Wim Roelandts, CEO, Xilinx


The revolution in logic design continues, with performance
improvements and new capabilities that help you create the
systems of the future, and get them to market faster than
ever before. FPGAs were once just interconnect routing and
logic gates; then we added dedicated hard cores for memory,
clock management, and I/O. Now, FPGAs are becoming the
platform on which a combination of complex hard cores and
flexible soft cores combine with an abundance of programmable
logic gates to give you the best possible performance, along
with the ease-of-use and time-to-market advantages for which
FPGAs are well known. Plus, we can bring you these advantages
at a lower cost than ever before.


Xilinx has entered a number of strategic partnerships and has
acquired key technologies for creating the new programmable
logic platform. Here is an overview of our recent activities.





The IBM Partnership 


Our recent partnership with IBM brings us
two immediate and dramatic benefits: the
power of the PowerPC™ hard core, and the
advanced CMOS manufacturing capability
of IBM’s state-of-the-art facilities. IBM
gets intellectual property (IP) from Xilinx
to help reduce defect densities and improve
manufacturing productivity. This partnership
has far-reaching implications, and
gives both companies a significant
competitive advantage.

The PowerPC Core 
IBM’s PowerPC module and CoreConnect™ 


bus will soon be integrated into Virtex-II 
FPGAs. With this powerful combination, 
you can achieve performance that was 
never possible before, and you can quickly 
develop unique system-level applications 
with greater ease. We found that the 
PowerPC was the most-used processor in 
high-end designs; our communication and 
computing customers use the PowerPC 
because it has good performance, and it has 
a lot of peripherals and other functions that 


make it easy to use. 


Processors like the PowerPC are often used 
as logic engines for low speed, very com- 
plex logic; they allow you to write detailed 
programs that perform intricate condition 
checking and control functions. However, 
because a processor basically executes one 
instruction at a time, it’s slower than actual 


gates which can operate in parallel. 


Now, in one Platform FPGA, you will have 
the best of both worlds; you have the dedi- 
cated PowerPC processor for complex con- 
trol applications, and you have program- 
mable logic gates for very-high-speed data 
paths. The big advantage of having this all 
on one chip is that you can very quickly 
move data from the PowerPC processor to 
on-chip peripherals or custom logic, which 
may be hard cores, soft cores, or unique 
designs created with programmable gates. 
This will give you much higher perform- 
ance than you get using separate chips, 
which must pass signals through their slow- 


er I/O interfaces. 


We are implementing the PowerPC proces- 
sor and other dedicated functions (such as 
memory, clock management, multipliers, 
and I/O interfaces) as hard cores, to give 
you the best possible performance. We will 
complement these hard cores with over 50
soft core peripheral functions. By keeping 
most of the peripherals as soft cores, you 
can choose only those functions that you 


need, and create custom designs with ease. 


Advanced CMOS Manufacturing 
Capability 


IBM is one of the most advanced CMOS 
semiconductor companies in the world, 
with device manufacturing technologies 
that are typically a year ahead of most 
other companies. Our partnership with 
IBM gives Xilinx access to this manufac- 
turing technology, and a tremendous 
competitive advantage. To be competitive
in our marketplace we have to push man- 
ufacturing technology to the limits. By 
using the most advanced manufacturing 
processes, we can reduce the size and cost 
of transistors, which enables us to contin- 
ue building bigger and bigger FPGAs and 


reduce the costs of existing devices. 


For IBM, FPGAs are the ideal “process 
drivers” to test and refine their advanced 
manufacturing processes. Because FPGAs 
have very regular structures and they allow 
us to address almost every square micron 
of space on a chip, it makes them ideal for 
troubleshooting problem process areas. 
So, Xilinx gets advanced manufacturing 
technology and IBM gets devices that help 
them drive their manufacturing process to 
maturity. Thus, we can achieve better 


yields, faster, and that means lower costs. 


Xilinx has been the leader in developing 
programmable logic technology, and we 
have expanded the market dramatically— 
today, the PLD business is growing 40% 
faster than the regular IC business. Our 
current technologies, bigger densities, 
higher speeds, and lower costs, are expand- 
ing the market much faster than in the 
past; with IBM, we are pushing it even fur- 
ther. 




Gigabit Serial I/O Capability


Many new systems today are requiring 
much faster data transfer between systems, 
boards, and devices, due primarily to the 
ever increasing demand for faster networks. 
Very high speed (gigabit per second) serial 
I/O capability promises to solve this diffi- 
cult problem. 


Traditionally, data has been shared by using 
parallel busses such as PCI (Peripheral 
Component Interconnect). However, there 
are inherent limitations with shared busses. 
To increase the speed of a shared bus, you 
can either increase the speed of each wire 
(which is very difficult to do because there 
are many of them), or you can increase the 
number of wires (which takes more and 
more I/O pins). For example, PCI was
once just 32 bits wide; now you also have
64-bit PCI, and that’s not enough. The
problem with this approach to increasing
bandwidth is that at some point you reach
a level of diminishing returns; the extra pins
and the need for shared bus protocols limit
the performance and make it prohibitively
expensive.
The best solution we have for this bandwidth
bottleneck is to use point-to-point
connections, over a single pair of wires,
operating at very high speeds. Currently,
with this technology, you can achieve a
data rate of two to three gigabits per second.
The big advantage of this method is
that you use fewer wires and less power, and
the total amount of data you can move is in
fact higher than with a typical parallel bus.


To create a gigabit serial I/O channel, a
hard core is needed; you cannot achieve
these speeds with soft cores in an FPGA.
The hard core performs several functions: it
receives and transmits the data, and it also
recovers the clock (because you can recover
the clock from the data, a single pair of
wires is all you need for data transfer). The
hard core must also serialize and de-serialize
the data. By de-serializing the data to a
16- or 32-bit internal bus, the clock rate of
the data is reduced by a factor of 16 or 32,
to a speed that an FPGA can easily handle.
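
The rate reduction is simple arithmetic, sketched here in Python. This is an illustration only: the function name `parallel_clock_mhz` is hypothetical, the 3.125 Gbps line rate is borrowed from the transceiver speed quoted elsewhere in this article, and line-coding overhead is ignored.

```python
def parallel_clock_mhz(line_rate_gbps: float, bus_width_bits: int) -> float:
    """Clock rate (MHz) of the internal parallel bus after de-serialization."""
    if bus_width_bits <= 0:
        raise ValueError("bus width must be positive")
    # One parallel word is produced for every bus_width_bits serial bits,
    # so the word clock is the serial bit rate divided by the bus width.
    return line_rate_gbps * 1000.0 / bus_width_bits


if __name__ == "__main__":
    for width in (16, 32):
        rate = parallel_clock_mhz(3.125, width)
        print(f"{width}-bit internal bus: {rate:.2f} MHz")
```

Under these assumptions a 16-bit internal bus runs at a little under 200 MHz and a 32-bit bus at a little under 100 MHz.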


With gigabit serial I/O, all the de-serializa- 
tion is done within the chip. When the 


work is done, you can serialize the data 


again and send it out to other 
devices, over a single pair of pins. 
This is a very efficient and low 


cost way to transfer data. 


We recently made several
announcements regarding our
commitment to gigabit serial I/O.

The Conexant™ Partnership 


Xilinx recently entered into a 
strategic development and licens- 
ing partnership with Conexant 
Systems, to integrate their 
SkyRail™ 3.125 Gbps serial 
transceiver technology into our 
next generation Virtex-II FPGAs. 
This hard core is the fastest core 
available in CMOS today, and 
will be available in the second 
half of 2001 in a select offering of 
several different Virtex-II devices. 
In our Virtex-II architecture you 
will get more than 20 different 
I/O standards, plus several of 
these gigabit serial I/O channels. 


The high-speed SkyRail trans- 
ceiver is compliant with industry standards 
such as Gigabit Ethernet and Fibre 
Channel in addition to the emerging 10- 
Gigabit Ethernet (IEEE 802.3ae) standard. 
By integrating quad transceivers, which are 
used to create 10-gigabit attachment unit 
(XAUI) interfaces, a single FPGA can 
interface to both 10-Gigabit Ethernet and 
OC-192c. The high-speed transceiver is 
also compliant with the 2.5 Gbps 
InfiniBand™ architecture standard being 
created by the InfiniBand Trade
Association.

The RocketChips Acquisition 
High-speed serial I/O capability is so 


important, we decided not to stop at the 
3.125 Gbps speed offered by the Conexant 
core—we are developing the technology fur- 
ther. That’s why we recently acquired a 
company called RocketChips, which is very 
active in creating high speed serial I/O 
cores. RocketChips already has a product 
that is very similar to the Conexant core, 
and they plan to develop even higher speed 
cores operating at 5 to 10 Gbps. 


RocketChips’ gigabit and multi-giga- 
bit serial CMOS transceiver tech- 
nologies provide solutions for a wide 
range of serial system architectures in 
networking, telecommunications, 
and enterprise storage markets. Their 
products include serial backplane 
transceivers (Single and Quad 3.125 
Gbps transceivers), telecom trans- 
ceivers (SONET OC-48 and OC- 
192), enterprise storage transceivers 
(Fibre Channel, Ethernet), and net- 
working transceivers (Gigabit 
Ethernet, 10 Gbps Ethernet, and 
InfiniBand). 


PMC-Sierra Partnership 


Xilinx recently announced the avail- 
ability of POS-PHY™ Level 3 Link 
Layer and Physical Layer cores. These 

cores provide solutions for the 
emerging Packet Over SONET 
(POS-PHY) applications, and both 
cores are compatible with the POS- 
PHY Level 3 interface specified by 

the SATURN® Development Group. With 
these cores, broadband system designers 
can rapidly develop highly functional, scal- 
able, and standards-based equipment to 
increase the speed of networks up to 2.5 
Gbps, and support the exploding growth of 
IP traffic over SONET/SDH backbones. 


Xilinx has also been active in the Optical 
Internetworking Forum (OIF) and the 
ATM Forum to drive POS-PHY Level 4 
acceptance. And, we are the only FPGA 
company to demonstrate over 800 Mbps 
operation, confirming that we can provide 
the full speed capability to support the 10 
Gbps OC-192 draft standard at the OIF 
(OIF2000.088.2). 


Serial Protocol Standards 


To use these high speed serial I/O channels 

effectively, you need well defined protocols 

and networking standards. Xilinx actively 

supports all of the emerging standards, 

including: 

• Lightning Data Transport™ (LDT) - A
chip-to-chip interconnect that provides
much greater bandwidth per I/O channel.
It can achieve a bandwidth of 6.4




Gbps per eight-wire link width, and can 
support up to 32 links. 


• InfiniBand™ - This newly designed interconnect
system utilizes a 2.5 Gbps wire
speed connection with one, four, or
twelve wire link widths. Promoted by an
association comprising industry leaders
such as Compaq, Dell, HP, IBM, Intel,
Microsoft, and Sun Microsystems,
InfiniBand intends to deliver a channel-based,
switched fabric technology.


• XAUI - A quad transceiver utilizing
3.125 Gbps serial links to create a 10
gigabit attachment unit interface
(XAUI). Multiple XAUI interfaces can be
implemented to allow a single chip to
interface to both 10 Gigabit Ethernet and
OC-192c.


• Fibre Channel - A high-bandwidth serial
standard offering 1.06 Gbps data rates
scalable to 2.12 or 4.24 Gbps. It is capable
of carrying multiple existing interface
command sets, including Internet
Protocol (IP), SCSI, IPI, HIPPI-FP, and
audio/video.


• Gigabit Ethernet + 10 Gbit
Ethernet - This includes devices
compliant with the IEEE 802.3
alliance.


• ATM (OC-12, OC-48, OC-192) -
This includes support for OC-12
(622 Mbps), OC-48 (2.4 Gbps),
and OC-192 (10 Gbps).


• RapidIO™ - A next-generation
switched-fabric interconnect architecture
for embedded systems that
is optimized for both high bandwidth
and low latency. Initial
implementations are expected to
exceed 1.0 Gbps throughput based
on clock rates of 250 MHz and
higher.
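
To see how the lane counts and lane rates quoted in this list relate to usable bandwidth, the short Python sketch below works through the arithmetic for XAUI and InfiniBand. It rests on one assumption that this article does not state: that both links use 8b/10b line coding, which sends 10 line bits for every 8 data bits. The function name `payload_gbps` is hypothetical.

```python
def payload_gbps(lanes: int, lane_rate_gbps: float) -> float:
    """Aggregate data rate of a multi-lane serial link, assuming 8b/10b coding."""
    raw_gbps = lanes * lane_rate_gbps
    # Assumption: 8b/10b coding, so only 8 of every 10 line bits carry data.
    return raw_gbps * 8 / 10


if __name__ == "__main__":
    links = [
        ("XAUI, four lanes at 3.125 Gbps", 4, 3.125),
        ("InfiniBand, one lane at 2.5 Gbps", 1, 2.5),
        ("InfiniBand, four lanes at 2.5 Gbps", 4, 2.5),
        ("InfiniBand, twelve lanes at 2.5 Gbps", 12, 2.5),
    ]
    for name, lanes, rate in links:
        print(f"{name}: {payload_gbps(lanes, rate):g} Gbps of data")
```

On that assumption, four 3.125 Gbps lanes carry 10 Gbps of data, which is why a quad transceiver is enough for a 10 Gigabit Ethernet attachment.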


These standards all use the same 
physical interface, so you can use our 
hard I/O cores for all of them. Then, 
we implement the level-2 protocols 
in programmable logic (soft cores), so 
you can quickly create designs using 
any of these standards. This gives you a lot 
of flexibility and it helps you interface 


directly to on-site networks. 
What Does It Mean? 


An FPGA is no longer just gates and rout- 
ing. Over the years we have added more and 
more hard cores, such as memory, clock 
management, and arithmetic functions. 
Now we are driving the technology a major 
step further by adding hard CPU cores and 
high speed serial I/O cores. Combine these 
dramatic technology advances with our 
high performance development tools, our 
unique Internet Reconfigurable Logic capa- 
bility, our extensive training and support 
services, our state-of-the-art manufacturing 
capabilities, and our ongoing partnerships 
with other industry leaders, and you get a 
logic design solution that can breathe life 


into your new designs. 


The role of the FPGA is changing; it is 
becoming a platform on which the combi- 
nation of soft cores, programmable logic, 
and hard cores gives you the best possible 
design solution—the speed of a custom 
ASIC and the time-to-market advantages of 
a flexible FPGA. 





Platform-based Design 

The Chief Technology Officer at Synopsys discusses the need for
platform-based design in the era of system-on-a-chip FPGAs.


As FPGAs move into million-gate densities, a world of new
possibilities and potential applications is opening up for
programmable logic devices. And along with
these new opportunities come many new
challenges. The pairing of a low-cost, high
performance PowerPC processor (and
other hard cores), along with the soft cores
and programmable logic circuitry in Xilinx
Virtex-II™ FPGAs means that you will
now be confronting challenges similar to
what ASIC designers encountered when
they made the transition to system-on-a-chip
(SoC) ASICs.


Everyone who participates in the multimillion-gate
segment of the FPGA market is
wrestling with the same issues: increased
complexity, escalating development costs,
evolving standards, too few engineers,
increasingly compressed design
cycles, and so on. In addition, as complexity
increases, the time it takes you to get a
product to market becomes dominated
more by your design time than by
manufacturing considerations, compromising
one of the key advantages of FPGAs.

To help address these challenges, there is
increasing pressure for designs to share a
common architecture or platform, especially
those that are targeted to similar applications.
A platform is a basic system architecture
that is geared towards a specific application,
such as cell phone base stations or
set-top boxes, among others; it is customized
through software and by adding
customized logic and IP.


An FPGA platform enables you to differentiate
your products by adding customized
logic and IP using the tightly integrated
FPGA fabric. Platforms are important
in the era of multimillion-gate FPGAs
because they enable you to focus on adding
value through custom IP rather than wasting
time and resources by recreating standard
components.


Figure 1 - Platform FPGA:
Xilinx Virtex-II FPGA with embedded PowerPC processor


Platform-Based Design 


A central piece of any platform is the 
embedded processor, such as the IBM 
PowerPC processor core in the Xilinx 
Virtex-II platform. A typical platform 
might also include a bus, DSP, input/out- 
put channels, mixed signal functions, 
memory, and some configurable logic such 
as shown in Figure 1. FPGA design thus 
becomes platform design; rather than sim- 
ply designing with gates, you must now 
focus on designing entire systems. 


For you to effectively exploit a platform by 
designing at the system level, four primary 


design considerations must be addressed: 


• Hardware design.
• Software design.
• Integration of hardware, software, and IP.
• Verification of the complete system (on a
chip).


Synopsys delivers solutions in all four of
these areas. The design of hardware has
been our traditional domain, and we offer
a suite of FPGA synthesis tools for this purpose.
FPGA Express™ addresses the push-button,
fast turn-around market, while
FPGA Compiler II™ addresses more complex
designs and compatibility with the
ASIC design flow. Looking further into the
future, other synthesis technologies such as
Synopsys Physical Synthesis will enable
full timing closure for platform FPGAs.


Figure 2 - SystemC tools


Open SystemC 


One of the most difficult aspects of soft- 
ware design involves how to interface soft- 
ware effectively with hardware. Open 
SystemC, a set of C++ class libraries that 
enables electronic design at the system 
level, provides an important tool for 
designing software and hardware in a com- 
mon language framework. Based on C and 
C++ (the languages of choice for most algo- 
rithm developers, system architects, and 
software developers) SystemC also includes 
all the language elements necessary to effec- 
tively address hardware design. In this way, 
trade-offs between hardware and software 
can be addressed dynamically, even includ- 


ing reconfiguration in the field. 


SystemC helps you create both systems and
chips; the suite of tools and methodologies
Synopsys has developed around SystemC
significantly accelerates the design of
electronic systems from concept to implementation
(see Figure 2). SystemC follows the
community source-licensing model and
can be downloaded from the Open
SystemC Initiative’s website at
www.systemC.org/.


IP Integration 


One of the advantages of platform-based 
design is that it supports the integration of 
other pieces of proprietary logic and third- 
party IP. In fact, it is the customized por- 
tion of any system-on-a-chip ASIC or plat- 
form FPGA that provides the competitive 
differentiation from one device to the next. 


But with the very large number of gates 
that can now be implemented on a single 
FPGA, the challenge is to become signifi- 
cantly more productive when creating with 
these gates. One obvious solution is to 
leverage existing gates through design 
reuse. Synopsys has been leading the way in
this area and, with its DesignWare® libraries of
reusable building blocks and methodology
activities, offers several time-saving options for
both ASIC and FPGA designers to leverage IP
from a variety of sources.


Figure 3 - Platform verification tools 


System Verification 


The challenge for any system-on-a-chip 
FPGA is to verify the complete system, 
including the processor core, and not just 
the individual blocks that comprise the sys- 
tem. This requires not only a high-speed 
simulator, but also a complete array of 
advanced verification tools. In particular, 
testbench generation, coverage tools, for- 
mal verification, a simulation model of the 
processor and other IP, and static timing 
analysis tools are essential for platform- 


based design (see Figure 3). 


Static timing analysis illustrates the verifi- 
cation challenges imposed by such a sys- 
tem. Synopsys PrimeTime® static timing 


analysis tool can time and analyze a com- 
plete chip, offering the multimillion-gate 
capacity that is required by systems on a 
chip. It also offers analysis modes that han- 
dle the processor core in an effective way. 
Conclusion


FPGAs that contain embedded processor 
cores and application-specific components 
are creating a need for platform-based 
design, which requires not only the suite of 
RTL logic design tools that are already in 
use today (with the right capacity), but also 
a comprehensive suite of system-level 
design tools that will be new to most FPGA 


designers. 


FPGA design has moved beyond the era of 
simple logic. With the advent of FPGAs 
that contain an embedded processor core, 
such as the PowerPC, FPGA designers will 
soon join their peers in the ASIC world by 
confronting the challenges of designing 
entire systems. Synopsys is helping you 
meet these challenges through the power of 
its system-level EDA tools optimized for 
platform-based design. 


For more information on all Synopsys 
products, see www.synopsys.com 
Inferring Multiplexers
in FPGA Compiler II
and FPGA Express


How to get better results by automatically inferring multiplexers that fully
utilize architecture-specific FPGA resources.


by Alan Ma 


Senior Corporate Applications Engineer, Synopsys, Inc. 
alanma@synopsys.com


In general, multiplexers can be implemented
by using Look Up Tables (LUTs). To obtain
the best quality of results (QoR), Synopsys
FPGA Compiler II™ and FPGA Express™
(FCII/FE) take it one step further by utilizing
the built-in multiplexer resources in
high-density FPGAs, which produces significantly
better results in both area and speed.

The Process 


During elaboration, the process of translating
the text-based description of a design to
an architecture-independent gate-level
representation, FCII/FE infers a generic primitive
called MUX_OP when it encounters multiplexers
in the Hardware Description
Language (HDL). During optimization,
MUX_OPs are mapped to architecture-specific
multiplexer resources. The following
sections describe the requirements for
MUX_OP to be inferred.

General Implementation 


Our research indicates that using architec- 


ture-specific multiplexer resources is only 


beneficial when the number of inputs meets 
certain requirements. Table 1 illustrates the 
multiplexer sizes and the primitives FCII/FE 
utilizes for Xilinx Virtex-II, Virtex, and 
XC4000 FPGAs (and their derivatives). 
FCII/FE automatically maps to these hard- 
ware resources (primitives) when you follow 


the recommended coding guidelines. 
Coding Guidelines 
Synopsys recommends the use of CASE
statements to describe multiplexer logic.
When the requirements on the number of
inputs for the target architecture are met (as
shown in Table 1), FCII/FE maps the design
to architecture-specific multiplexer resources
if at least 75% of all possible cases are specified.


Figure 1 shows an example of an eight-to-one
multiplexer in Verilog. Figure 2 illustrates
its VHDL equivalent. Note that the
control signal sel has three bits so there can
be as many as eight possible cases. As a result,
at least six (75% of eight) cases need to be
specified for multiplexers to be inferred.


Architecture   Min. Inputs   Max. Inputs   Primitives Used
Virtex-II      [illegible]   256           LUT, MUXF5, MUXF6
Virtex         [illegible]   256           LUT, MUXF5, MUXF6
XC4000         [illegible]   256           FMAP, HMAP

Table 1 - Multiplexer size requirements for automatic inference
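
The 75% rule can be checked for any select width with a few lines of Python. This is only a sketch of the arithmetic described above, not part of the Synopsys tools; the function name `min_cases_for_inference` is hypothetical.

```python
import math


def min_cases_for_inference(select_bits: int, fraction: float = 0.75) -> int:
    """Smallest number of specified cases that reaches the given fraction."""
    if select_bits < 1:
        raise ValueError("select signal needs at least one bit")
    # An n-bit select signal has 2**n possible case values.
    total_cases = 2 ** select_bits
    return math.ceil(total_cases * fraction)


if __name__ == "__main__":
    for bits in (2, 3, 4):
        needed = min_cases_for_inference(bits)
        print(f"{bits}-bit select: at least {needed} of {2 ** bits} cases")
```

For the three-bit sel signal in the example this gives six cases, matching the figure quoted in the text.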







Using the infer_mux Directive 


Figure 3 shows a similar eight-to-one 
multiplexer with the addition of sever- 
al arithmetic operators; Figure 4 shows 
its VHDL counterpart. To allow oper- 
ator sharing, multiplexers are generally 
not automatically inferred for CASE 
statements which contain more than 
one operator (regardless of the number 
of cases specified). However, you have 
the option to override FCII/FE by 
using the infer_mux directive. 


The infer_mux directive forces
FCII/FE to infer multiplexers as long
as at least 50% of all possible cases are
specified. It can be used when:


• The requirements on the number of
inputs (as shown in Table 1) are not
met.


• The CASE statement contains more
than one arithmetic operator.


It is important to understand that
FCII/FE generally makes intelligent
decisions on multiplexer inference
based on the cost of doing so. For
example, it may choose not to infer
multiplexers, to allow operator sharing
for better performance. As a result,
QoR is likely to suffer if you override
that decision by using infer_mux.
Please use this directive with caution.
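
The inference rules described in this article can be summarized as a small decision function. The Python sketch below is one reading of those rules and is not the algorithm used by FCII/FE; the function name `infers_mux` and its parameters are hypothetical, and the minimum input count must be supplied by the caller from Table 1 for the target architecture (the value 4 in the usage lines is a placeholder, not a Table 1 figure).

```python
def infers_mux(num_inputs: int, cases_specified: int, select_bits: int,
               operators: int, min_inputs: int, max_inputs: int,
               use_infer_mux: bool = False) -> bool:
    """Return True if, by the rules in this article, a multiplexer is inferred."""
    coverage = cases_specified / (2 ** select_bits)
    if use_infer_mux:
        # The directive overrides the size and operator checks,
        # but still needs at least 50% of the cases specified.
        return coverage >= 0.50
    size_ok = min_inputs <= num_inputs <= max_inputs
    # Automatic inference: size within range, no more than one
    # operator in the CASE statement, and at least 75% coverage.
    return size_ok and operators <= 1 and coverage >= 0.75


if __name__ == "__main__":
    # Placeholder limits: min_inputs=4 is illustrative only.
    print(infers_mux(8, 6, 3, 0, min_inputs=4, max_inputs=256))
    print(infers_mux(8, 6, 3, 2, min_inputs=4, max_inputs=256))
    print(infers_mux(8, 4, 3, 2, 4, 256, use_infer_mux=True))
```

How the default branch counts toward the specified cases is left to the caller here, because the article does not spell it out.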
Conclusion 


FPGA Compiler II and FPGA Express
take advantage of Xilinx-specific mul- 
tiplexer resources to deliver the best 
quality of results. The tools automati- 
cally infer multiplexers if the design 
complies with the coding guidelines 
and meets the requirements for the 
target architecture. You also have the 
option to force multiplexer inference 
by using the infer_mux directive. 


Visit the Synopsys FPGA website at
www.synopsys.com/fpga for other
information on the latest FPGA
synthesis technologies.
module mux_8to1 (
    a, b, c, d, e, f, sel,
    mux_out
);
input        a, b, c, d, e, f;
input  [2:0] sel;       // three select bits: up to eight cases
output       mux_out;
reg          mux_out;

always @(sel or a or b or c or d or e or f)
    case (sel)
        3'b000  : mux_out = a;
        3'b001  : mux_out = b;
        3'b010  : mux_out = c;
        3'b011  : mux_out = d;
        3'b100  : mux_out = e;
        default : mux_out = f;
    endcase
endmodule


Figure 1 - Using CASE statements for
multiplexers in Verilog


module mux_8to1 (
    a, b, c, d, e, f, sel,
    mux_out
);
input        a, b, c, d, e, f;
input  [2:0] sel;
output [1:0] mux_out;   // two bits wide to hold the sum or difference
reg    [1:0] mux_out;

always @(sel or a or b or c or d or e or f)
    case (sel) // synopsys infer_mux
        3'b000  : mux_out = a + b;
        3'b001  : mux_out = a + c;
        3'b010  : mux_out = d - e;
        default : mux_out = d - f;
    endcase
endmodule


Figure 3 - Using infer_mux for multiplexer
inference in Verilog


library ieee;
use ieee.std_logic_1164.all;


entity mux_8to1 is port (
    a, b, c, d, e, f: in std_logic;
    sel: in std_logic_vector(2 downto 0);  -- three select bits: up to eight cases
    mux_out: out std_logic
);
end mux_8to1;


architecture rtl of mux_8to1 is
begin
    process (sel, a, b, c, d, e, f)
    begin
        case sel is
            when "000" => mux_out <= a;
            when "001" => mux_out <= b;
            when "010" => mux_out <= c;
            when "011" => mux_out <= d;
            when "100" => mux_out <= e;
            when others => mux_out <= f;
        end case;
    end process;
end rtl;


Figure 2 - Using CASE statements for
multiplexers in VHDL


library ieee;
use ieee.std_logic_1164.all;


entity mux_8to1 is port (
    a, b, c, d, e, f: in std_logic;
    sel: in std_logic_vector(2 downto 0);
    mux_out: out std_logic_vector(1 downto 0)
);
end mux_8to1;


architecture rtl of mux_8to1 is
begin
    process (sel, a, b, c, d, e, f)
    begin
        case sel is -- synopsys infer_mux
            when "000" => mux_out <= a + b;
            when "001" => mux_out <= a + c;
            when "010" => mux_out <= d - e;
            when others => mux_out <= d - f;
        end case;
    end process;
end rtl;


Figure 4 - Using infer_mux for multiplexer
inference in VHDL





Development Tools


Spartan-II PCI Development Kit


Insight Electronics has introduced a Spartan-II PCI Development Kit
to help you jumpstart your next 32-bit PCI design.


by Jim Beneke
Technical Marketing Manager, Insight Electronics
jim_beneke@ins.memec.com


When Xilinx introduced the Spartan™-II
FPGA family in January 2000, they not only
offered the lowest cost FPGA devices with
system-level features, they also enabled
programmable logic to effectively replace
off-the-shelf ASSP devices for 32-bit PCI
applications. Combined with the proven PCI32
LogiCORE™ interface from Xilinx, the
Spartan-II PCI solution was the common-sense
choice for most PCI designs.


Unfortunately, designers wishing to target
a Spartan-II device for a PCI project were
not able to prototype their design with an
off-the-shelf PCI platform. Insight
Electronics recognized the need for this
type of development board, and introduced
the Spartan-II PCI Development Kit. The kit
includes a Spartan-II prototype board, a
single-use Spartan PCI32 LogiCORE
license, Windows driver development
software, one day (eight hours) of Insight
Design Services support, reference designs,
Windows-based applications, example
Windows 98/NT drivers, source code, and
hardware documentation. The demonstration
board is based on the 150K-gate Spartan-II
FPGA, in a 208-pin plastic quad flat
package (PQFP).


Implementing the full initiator/target PCI
interface in the FPGA only consumes about
ten percent of the logic resources, leaving
approximately 135K gates for custom user
back ends. Unlike other PCI prototype cards,
the Spartan-II PCI board does not contain
back-end application circuits to complicate
your custom design. Instead, all user I/Os are
brought out to expansion connectors for easy
access and interfacing. This allows your
designs to be quickly implemented, configured,
and tested. Figure 1 shows a block diagram
of the Spartan-II PCI card included in
the kit.


The New Reference Design Center 


In addition to the Spartan-II FPGA, the PCI board also includes the
new Xilinx XC18V01 in-system programmable configuration PROM. This
allows PCI application designs to be quickly downloaded multiple
times to the board and saved in non-volatile memory.





Figure 1 - Spartan-II PCI development board


With this reconfigurable feature, Insight is including access to its
new Reference Design Center. At the Reference Design Center, owners
of the Spartan-II PCI kit can download pre-configured PCI application
designs and run them on the demonstration board. Developed by Insight
Design Services, these off-the-shelf application designs can be used
as is, or can be customized to meet certain application needs. In
addition to providing reference design bit streams and their
associated source code files, the Reference Design Center also
provides example Windows drivers and Windows-based application
programs. Both drivers and application programs are provided with C++
source code so you can understand how the examples work.


To assist in the development and debugging of Windows device drivers,
the Spartan-II PCI Kit includes Compuware’s NuMega driver development
software. The NuMega package simplifies the task of writing and
configuring Windows drivers through a series of GUI windows.


The Xilinx Spartan 32/33 PCI Core


The Insight Spartan-II PCI Development Kit includes the new
single-use version of the Xilinx Spartan-only 32-bit, 33 MHz PCI
core. The single-use license allows the kit owner to support a single
production PCI core implementation. If multiple PCI core solutions
are required, then the core license can be upgraded to an unlimited
license for a nominal fee. The 32-bit Spartan PCI core is configured
and downloaded through the Xilinx PCI Lounge. The downloadable core
netlist is fully PCI v2.2 compliant and supports initiator and target
functions with zero-wait-state burst operation.


Conclusion


By providing exactly what is needed to complete a PCI design, the
Spartan-II PCI Kit meets the demands of both experienced and new
designers of programmable logic-based PCI interfaces. Several
versions of the Spartan-II PCI Development Kit are available from
Insight Electronics. Prices range from $145 for a PCI card-only kit
to $3,995 for the complete Spartan-II PCI Development Kit. For more
information, go to
www.insightelectronics.com/solutions/kits/xilinx/spartan-iipci.html.


Configurable Processors


Choosing the ARC User-Configurable Processor


ARC Cores and Xilinx provide [...] everything you will need to
develop custom processor applications.


by Emmanuel Benzaquen
Third Party Program Manager, ARC Cores
eben@arccores.com


As FPGA [...] Virtex product family, it is becoming increasingly
practical to implement complete systems in a single FPGA. A soft
processor core represents an attractive solution for
user-configurable System-on-Chip (SoC) applications.


• An ARC design can be turned from VHDL or Verilog into a
configuration that runs on the Xilinx FPGA-based ARCangel prototype
board in a few hours

• Both software and hardware can be tested and benchmarked at the
same time

The ARC soft processor design and debug cycle


Processor Cores Complement Programmable Logic


Traditionally, an important motivation for adding microprocessors
into a design has been that software programmable solutions are easy
to change and upgrade. Since FPGAs are, by definition, programmable,
you can always upgrade them. System designers know that it is much
easier to design and implement certain parts of a system using
software, while hardware implementations offer greater performance.
For example, you may want to take advantage of a large amount of
low-cost software intellectual property (IP) that is available in C
or C++ code, for functions such as protocol stacks and modem
algorithms. You may also want to implement high-speed co-processor
functions in hardware. You can get the best of both worlds when you
combine the hardware re-programmability of FPGAs with the software
programmability of microprocessors.


The Hard Versus Soft Option


Major FPGA vendors typically provide two different approaches to
including processor cores in an FPGA. One approach offers a soft
processor core that is provided in a synthesizable HDL format. This
processor core is then included in a generic FPGA using the same
design process as the rest of the logic. The second approach embeds a
specific hard processor core (such as the PowerPC) into the FPGA. The
most appropriate choice will depend on the application.


As a general rule, a hard processor core will offer higher clock
speeds than a soft core. However, since the hard processor solution
will require a specialized FPGA with dedicated processor buses and
routing, it will be less flexible than incorporating a soft processor
in a generic FPGA. In addition to performance and flexibility
trade-offs, the choice between a hard or a soft processor will also
be influenced by the software applications you wish to run.


Since soft processors are available as synthesizable HDL, they
inherently provide more design flexibility than hard processors
because you can modify the core interface to fit better into a
specific design. Some soft processors provide even greater
flexibility by being configurable. A configurable processor core may
include a graphical tool that enables certain functions to be
included or excluded without having to manually modify HDL source
code. As a result, you can create a processor core that is customized
for your specific application by using the GUI.


The ARC processor is a perfect example of a soft, user-configurable
core available for immediate use in Xilinx FPGAs.


Software Tools


The most important factor that will influence your choice of a soft
processor is the software tool set that supports the code that will
run on it. SoC designs can carry anywhere from 1 Kbyte to several
megabytes of software code. For applications that have only a few
Kbytes of code, basic software tools such as an assembler may be
sufficient. However, once the amount of code starts increasing and
becoming more complex, it becomes essential to use a high-level
language like C or C++.


ARC Cores provides a complete set of high-level development tools
customized for embedded applications, and offers both DSP (Digital
Signal Processing) and general-purpose control functions within the
same processor architecture. There is no need to learn two different
processor architectures and development tool environments.


Integrated Software Environment


Because of the typical complexity of the software code, ARC offers
the Metaware development environment. This professional set of
software development tools includes a C/C++ compiler,
assembler/linker, and the SeeCode™ source-level debugger. Most
importantly, it offers you the ability to debug the embedded software
running on the processor in the FPGA. It is critical that the core
and its host interface include execution control capabilities like
breakpoint checking so you can break the program execution or monitor
reads and writes to program variables.


As the software content of a design increases, another important
factor is the range of applications supported and the available
systems software. For example, if a design requires several hundred
Kbytes of code along with standard communications software, such as
TCP/IP protocols, you can save several months or more of design time
by purchasing a real-time operating system (RTOS) that includes
prepackaged protocols. ARC supports a large variety of commercially
distributed RTOSs from leading vendors and is constantly increasing
their ease of integration.


In addition to the software tools and applications described above,
another critical factor in choosing the ARC core is its level of
flexibility. Unlike other configurable processors available today,
which sometimes require you to manually “hack” the HDL code, the ARC
processor core enables you to easily select special options for
configuring the processor. Hacking the HDL code after configuring the
processor core might break the core, or even make it incompatible
with the software development tools.


ARC provides the flexible ARChitect Graphic User Interface (GUI) that
can be used to safely create your custom configured processor. This
is very helpful when using a soft processor in an FPGA, and it allows
you to experiment with different options and configurations within
minutes.


"ARC, Third Generation IP" 





Instruction Set Flexibility 


The instruction set is one of the most important aspects to consider
when choosing a configurable processor. One potential disadvantage of
soft processors is that they cannot attain the high clock speeds of a
hard processor. For a conventional processor design, the clock speed
is essentially the key determinant of performance. The ARC processor
changes this equation by offering a configurable instruction set and
the ability to add custom instructions. This enables you to
accelerate an algorithm by selecting or adding a few appropriate (but
powerful) instructions specifically needed for the application that
is being executed. Thus, you can get the best of both RISC (Reduced
Instruction Set Computer) and CISC (Complex Instruction Set Computer)
processor design architectures. This approach provides high
performance at lower clock speeds, while still maintaining a software
programmable solution.


• The ARC IP is deeply embedded with the rest of the logic and
interfaces directly with other customer logic functions in the Xilinx
FPGA

• “Gate-hungry” complex system buses and associated logic are no
longer needed to reach high performance because of the tight
integration


Instruction extensions are available from ARC and some third parties.
Plug-ins can be used and implemented directly in the design. For
additional capability, you can also create your own specific
instructions. Custom instruction extensions offer you a particularly
powerful way to accelerate application performance while retaining
programmability. Consider the example of a DES (Data Encryption
Standard) encryption application: by adding specialized
bit-permutation and cipher instructions, plus additional registers to
hold the keys, it is possible to greatly accelerate a range of
encryption algorithms.
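
To see why a dedicated bit-permutation instruction helps, consider
what the same operation costs in software. The following is a minimal
sketch, not ARC code: it applies the 64-bit initial permutation from
the published DES standard using only shifts, masks, and ORs, which
takes one loop iteration per output bit. The function and variable
names are illustrative, and the custom instruction itself is not
modeled. In hardware, a fixed permutation is essentially wiring, which
is why it is a natural candidate for a single added instruction.

```python
# DES initial permutation (IP) table from the published standard.
# Entry i names the input bit (1 = most significant) that becomes output bit i.
IP_TABLE = [start - 8 * i
            for start in (58, 60, 62, 64, 57, 59, 61, 63)
            for i in range(8)]

# Inverse table, derived from IP_TABLE so that it undoes the permutation.
INVERSE_TABLE = [0] * 64
for out_pos, in_pos in enumerate(IP_TABLE, start=1):
    INVERSE_TABLE[in_pos - 1] = out_pos


def permute(block: int, table) -> int:
    """Permute a 64-bit block in software: one shift, mask, and OR per bit."""
    out = 0
    for src_bit in table:
        # Pick input bit `src_bit` (1 = MSB) and append it to the output.
        out = (out << 1) | ((block >> (64 - src_bit)) & 1)
    return out


if __name__ == "__main__":
    plaintext = 0x0123456789ABCDEF
    scrambled = permute(plaintext, IP_TABLE)
    # Applying the inverse table recovers the original block.
    assert permute(scrambled, INVERSE_TABLE) == plaintext
    print(f"{len(IP_TABLE)} loop iterations per 64-bit block in software")
```

Each DES block needs this permutation, its inverse, and several
smaller ones in every round, so the per-bit software cost adds up
quickly.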


To provide a truly configurable instruction set, it is also important
that the number of clock cycles for an instruction extension is
configurable. For example, the ARC processor enables the addition of
multi-cycle instructions to the pipeline where desired, and allows
single-cycle operations to proceed in parallel with long-latency
ones. This is an advantage over architectures that enforce a strict
RISC paradigm where every instruction must execute in a single cycle.
Such restrictions may make it impossible to add very powerful,
complex instructions that require multiple cycles to execute.


Interaction with Other Logic Functions


The ARC processor can further improve performance by enabling tight
integration between the processor core and other logic on the FPGA.
Traditional processor cores typically communicate with peripheral
hardware via a system bus. To send data to the processor, the
peripheral interrupts the processor, which then processes the
interrupt using a software routine known as an ISR (Interrupt Service
Routine). In addition to supporting this approach, the ARC processor
enables you to add new core extension registers. If desired, the new
registers can be directly accessed by peripheral logic, enabling such
devices to communicate with the processor directly. These alternative
approaches can improve performance and reduce gate count by
eliminating the need to duplicate a complex system bus and its
arbitration logic in an FPGA.


It is no longer necessary to pass data via a bus or to interrupt the
processor to have it load data from a memory-mapped register. Since
the special registers are unique to a particular piece of peripheral
logic, there is no need for any decoding or arbitration logic. The
firmware simply selects the special-purpose registers to communicate
with the peripheral.


In addition to providing extension registers, configurable processors
like the ARC core can also simplify integration with additional logic
by providing multiple buses. This approach enables operations such as
instruction fetches, load/stores, and communication with peripheral
logic to reside on separate buses. As a result, the bus protocols of
each bus can be relatively simple since there is no need to arbitrate
between multiple devices attempting to control one bus. The ARC
processor has four buses, consisting of instruction and data buses
(Harvard architecture), a bus directly into the processor registers
(primarily used for debugging), and an auxiliary bus (typically used
to connect peripheral logic). The auxiliary bus has a very simple
interface that enables peripherals to be connected with just a few
wires. This is well suited to FPGAs where there is no actual bus,
allowing peripherals to be efficiently connected in a point-to-point
manner.


Tool Configurability
Any processor that offers a high degree of configurability must also
offer equally configurable software tools and a debugging environment
that work in coordination. It is of no use to add new instructions to
the processor if there is no way of telling the compiler and
assembler about them so that actual software programs can take
advantage of them. In a similar vein, the compiler must let you
specify which instructions will be present in the processor, as well
as be able to take advantage of features such as multipliers or
barrel shifters when they are included. In fact, software tool
configurability is one of the greatest challenges in providing a
truly configurable processor solution.


ARChitect creates an HDL description of the CPU (VHDL, Verilog)


ARC and Xilinx are responding to this challenge by offering a
complete “plug and play” solution to FPGA designers. In addition, the
ARC tools suite allows you to enhance the original configurations
offered in a simple manner.


Conclusion


Soft processor cores give you the ability to include processors in
standard FPGAs. Configurable cores can help you achieve higher
performance at lower clock rates through instruction extension and
peripheral logic integration. ARC and Xilinx offer the perfect
combination of a configurable core with powerful extensions and
third-party “plug-ins,” in addition to a complete development
environment and operating system support, ready to use with Xilinx
FPGA technology.


... and a complete software tool chain to program it! (C, C++, ASM,
profiler, linker, simulator, debugger, etc.)


Tools configurability: ARChitect, the ARC Graphic User Interface that
makes it all possible


Re-thinking Your Verification Strategies for Multimillion-Gate FPGAs


by Thomas D. Tessier 
President, t2design Incorporated 
tomt@hdl-design.com 


FPGA verification is essential for successful on-time product
delivery, and today's million-gate FPGAs require you to re-think your
old verification strategies. Many engineers continue to use
simulator-specific approaches for verification; the simulation tools
are primarily used for module testing, while the lab is used for
system-level integration. This approach requires the engineer to
manually stimulate signals and view the resulting waveform responses.
Because this process is time consuming, error prone, and difficult to
repeat, engineers often spend minimal time in simulation and quickly
move to debugging in the lab. Multimillion-gate FPGAs implement
functions far too complex to rely on this ad-hoc method.


Designers are choosing million-gate FPGAs because they are fast
enough and large enough to handle the design complexity that was
previously achievable only with an ASIC. When ASIC engineers begin to
use high-density FPGAs, they take their verification approaches with
them. Those who use a validation process with robust tools and a
complete self-checking testbench environment find that continuing to
use their familiar ASIC testing approaches now causes them to lose
valuable design cycle time.


Designers can benefit from a carefully defined and executed
verification plan that takes FPGA reprogrammability into
consideration. Time that was once well spent in exhaustive
verification at the RTL level with an ASIC now becomes costly for a
high-density FPGA.


What is Verification?


Verification is not synonymous with simulation. It is a strategy to
make sure all parts of the system conform to the specification
document; simulation is a tool used in the verification effort. The
basic components of verification are shown in Figure 1.


Specification


A detailed and complete specification is essential for producing
working products on schedule. The specification document is the
foundation of the verification plan, and describes the features to be
implemented, under what conditions they occur, and what their
expected outputs should be. This documentation should not determine
the implementation; that is left to the experience of the RTL
designers.


Verification Plan


RTL engineers and verification engineers share the responsibility for
implementing the test plan. The plan outlines the level of test
granularity (or detail): transactions, protocol, interfaces, and
timing. It identifies the essential functions, and it determines the
number of testbenches needed, their complexity, and the dependencies
between test modules.


Any discrepancies between the design implementation and the testbench
results should be referred back to the specification for
clarification. This is not a new concept, but it is often overlooked
in the rush to produce a product. When all elements described within
the test plan are checked off, the verification effort has been
completed to the required level of confidence. To optimize your
verification effort, the following list offers examples of the type
of information you need to identify:
• External interfaces
- Stimulus and response
- Transaction level, such as Read vs. Write operations
- Timing requirements

• HDL models available to assist in testbench development
- Packaged with proposed Intellectual Property (IP)

• Tools available to the project
- Simulators
- Static Analysis
- Lab-based tools

• Performance Requirements, such as:
need 32 block data write @ 66 MHz
with a latency of less than 300 ns.
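
A performance requirement like the last one is easiest to check in
simulation once it is restated in clock cycles. The sketch below is a
hypothetical helper, not part of any tool mentioned here; it assumes
that latency is measured in whole cycles of the 66 MHz clock.

```python
def max_cycles_within(latency_ns: int, clock_mhz: int) -> int:
    """Largest whole number of clock cycles lasting strictly less than latency_ns."""
    # One cycle lasts 1000 / clock_mhz nanoseconds, so n cycles fit when
    # n * 1000 < latency_ns * clock_mhz. Integer math avoids rounding error.
    return (latency_ns * clock_mhz - 1) // 1000


def meets_latency(measured_cycles: int,
                  latency_ns: int = 300,
                  clock_mhz: int = 66) -> bool:
    """True when a latency measured in cycles satisfies the requirement."""
    return measured_cycles <= max_cycles_within(latency_ns, clock_mhz)


if __name__ == "__main__":
    budget = max_cycles_within(300, 66)
    print(f"Latency budget: {budget} cycles")  # 19 cycles is about 288 ns
    print(meets_latency(20))                   # 20 cycles is about 303 ns
```

Writing the limit down as a cycle count gives the testbench a single
number to compare against instead of a waveform to inspect by eye.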


Execution 


A verification strategy that best suits your design means breaking
out those functions that are essential to simulate and those that can
be tested during in-system test. The execution of the Verification
Plan requires simulation and in-system test on the target PC board,
the final stages of the pyramid.


Figure 1 - Verification pyramid


Verification Simulation


Simulation has two components:


• Dynamic simulation covers behavioral HDL, RTL, and gates.


• Static analysis encompasses Static Timing Analysis (STA), Formal
Verification, and Signal Integrity Analysis.


In-System Test 


During in-system test you have a distinct advantage when using FPGAs
over ASICs. An obvious benefit is the ability to reprogram the FPGA
until the desired functionality is achieved. You also have an
additional advantage with the Xilinx ChipScope Integrated Logic
Analyzer, which enables you to observe internal nodes of the chip, on
your PC board, while running at system speeds.


Interaction of Verification and Design Creation


Verification has many interactions with design creation, as shown in
Figure 2. To prevent confusion and save time, the design and
verification teams must work from the same thorough specification. In
addition, the RTL design engineers and verification engineers must
share the responsibility for implementing the test plan. Testbenches
are written to validate the design against the specification, not to
verify the design implementation.


Once the executable specification (of the design) and testbench, both
written in behavioral HDL, meet the requirements, the design is
replaced with RTL code. The RTL is then verified with the
system-level testbenches to make sure it meets the written
specification conditions. After the RTL is validated, it is
synthesized and processed by the Place and Route tools. The resulting
gates are plugged into the system verification testbenches, or into
formal verification if it is available. This ensures the tools have
correctly implemented the design. In addition, the generated gates
are run through static timing analysis. This step verifies that the
system-level timing is met.
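
The flow above depends on one testbench that checks results against
the specification and can be pointed at each representation of the
design in turn. The following Python sketch illustrates that idea in
software only; it is not HDL and not the author's environment. It
uses a small six-input multiplexer as a hypothetical specification,
and the two model functions merely stand in for the behavioral and
RTL versions of a design.

```python
from itertools import product
from typing import Callable, List, Sequence

# Hypothetical specification: which data input each select code routes
# to the output. Any select code not listed routes the last input.
SPEC = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
DEFAULT_INPUT = 5

Design = Callable[[int, Sequence[int]], int]


def behavioral_model(sel: int, data: Sequence[int]) -> int:
    """Stand-in for the executable specification: a direct table lookup."""
    return data[SPEC.get(sel, DEFAULT_INPUT)]


def rtl_style_model(sel: int, data: Sequence[int]) -> int:
    """Stand-in for the RTL: the same function as explicit decode logic."""
    s2, s1, s0 = (sel >> 2) & 1, (sel >> 1) & 1, sel & 1
    if not s2:
        return data[(s1 << 1) | s0]
    return data[4] if not (s1 or s0) else data[5]


def run_testbench(design: Design) -> List[str]:
    """Self-checking testbench: compare every response with the spec."""
    failures = []
    for sel in range(8):
        for data in product((0, 1), repeat=6):
            expected = data[SPEC.get(sel, DEFAULT_INPUT)]
            actual = design(sel, data)
            if actual != expected:
                failures.append(f"sel={sel} data={data}: "
                                f"expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    for name, design in (("behavioral", behavioral_model),
                         ("rtl-style", rtl_style_model)):
        problems = run_testbench(design)
        print(name, "PASS" if not problems else f"FAIL ({len(problems)})")
```

Because the expected values come from the specification table rather
than from either model, the same testbench can be rerun unchanged
each time the design is replaced by a more detailed representation.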


System integration is typically referred to as “power on”. This is
the time when project teams come up with creative answers to the
question “is it working yet?” Projects are ready for in-system test
when they have validated RTL code, have been successfully placed and
routed, and can create a bit stream to program the FPGA on the
physical PCB. At this point it is expected that module-level
partitions have been tested for functionality, and that module
interfaces are stable and well defined. The design has been
simulated, as a chip, at both RTL and gate levels with the minimum
functionality necessary for power on. The simulation of the chip is
often not achieved by FPGA design teams still using
simulator-specific approaches.


Figure 2 - Interaction of the verification components


Conclusion 


A verification strategy combining simulation, static analysis, and
in-system testing is key to success with high-density FPGAs. You are
bombarded with many different choices for verification of a design;
to meet time-to-market pressures you need to leverage multiple
approaches.


A detailed application note is available to guide you through the
verification decision process, including an in-depth case study. It
evaluates the design-specific trade-offs of choosing functions that
are essential to simulate and those that can be tested during
integration. Prepackaged IP testbenches are also evaluated for
applicability in the system testbench. The full application note can
be found at: www.hdl-design.com.


Foundation ISE: What's In a Name?


Xilinx Integrated Synthesis Environment stirs Design Automation
Conference debate.


by Craig N. Willert
Software Marketing Manager, Xilinx
cnw@xilinx.com


The new 3.1i Foundation™ ISE software from Xilinx made its debut at
this year’s Design Automation Conference (DAC), leaving many with the
question “What should ISE stand for?” Xilinx thought the name would
speak for itself: Foundation ISE is an Integrated Synthesis
Environment. But designers viewing the product for the first time at
the DAC show excitedly came up with other ideas of what “ISE” should
mean.


• “I” is for Ingenious, Intelligent, Internet-Enabled, Incremental,
Innovative, Intriguing, Inspiring, Inventive, Imaginative,
Insightful, Intuitive, and Interoperable.


• “S” is for Simple, Speedy, Sensible, State-of-the-art, Smart,
Savvy, and Sexy.


• “E” is for Engineered, Easy, Efficient, Empowering (EDA partners),
Expedient, Easy-to-Use, Extra-Special, Essential, and Eloquent.


What is ISE?


To understand the basis for the differing opinions, it’s necessary to
look at the current state of the design process.


Integrated design, synthesis, and implementation tools automatically
handle all of the file dependency issues that any designer faces, by
answering questions like “What tool do I need to run next?” and “Have
I re-synthesized all of the modified HDL blocks?” But time and time
again, designers are synthesizing their designs with two or more
synthesis tools, trying to create the best design implementation from
all of the variables.


To simplify this approach, Xilinx has built in HDL optimization using
Xilinx Synthesis Technology and the FPGA Express HDL synthesis tools
from Synopsys. This ensures that every engineer using Xilinx
Foundation ISE will have access to at least two HDL synthesis tools
that are highly compatible and tightly integrated.


Furthermore, a design “environment” is 
distinguished by its ability to address all 
of your needs as a designer, not just a few 
specific design functions. Foundation ISE 
provides an environment that ensures a 
comprehensive, integrated design flow for 
any programmable logic designer looking 
for an integrated solution that is capable 
of delivering world-class results with 
push-button flows. 


Conclusion 


The Xilinx 3.1i Foundation ISE software is already being heralded as
the industry’s best programmable logic design tool. By integrating
the HDL design flow, synthesis, and optimization, Xilinx Foundation
ISE enables you to spend more time on the creative aspects of
programmable logic design. This helps you focus your resources and
increase your productivity so you can get to market faster and
deliver a more robust product to your customers. Xilinx 3.1i
development systems deliver superior push-button, interactive,
state-of-the-art design methods.


The 3.1i release will begin shipping to all registered,
in-maintenance customers this Spring. To learn more, please visit the
Xilinx website at: www.xilinx.com.


Integrated design flows increase your productivity and accelerate
your time to market.


by Justine Chen
Product Marketing Manager
Worldwide Software Marketing
justine.chen@xilinx.com

Karen Fidelak
Product Marketing Manager
Design Software Division
karen.fidelak@xilinx.com

Teams of software engineers from Synopsys, Synplicity, Model
Technology, Visual Software Solutions, and Xilinx, working in close
collaboration, have created the ultimate in design automation tools:
Xilinx Foundation Series™ ISE (Integrated Synthesis Environment). The
Foundation Series ISE software gives you the most advanced design
automation tools, in a fully integrated, fast-working environment
that increases your productivity and accelerates your time to market.


The Foundation Series ISE software includes:


• Synopsys FPGA Express - HDL synthesis software.

• Synplicity Synplify - HDL synthesis software.

• Model Technology ModelSim - HDL simulator.

• Visual Software Solutions HDL Bencher - Automatic testbench
generation tools.

• Visual Software Solutions StateCAD - Automatic state machine
generation tools.

• Xilinx XST synthesis technology - For further optimization.

• Xilinx implementation tools - For optimum use of device resources
and the fastest place and route times in the industry.


The Keys to Increased Productivity


In the past, most large digital design companies relied on individual
point tools, and were less concerned with managing the flow of data
between the tools. Solving the problem of connecting point tools came
later, and required customized design flows. This need to connect
data flows between various point tools led to development of standard
information exchange interfaces, such as HDL. But HDLs, including
Verilog and VHDL, though useful as industry standards for hardware
design, did not deliver a complete solution. For example, various
simulation and synthesis tools might interpret and optimize
differently, and produce undesirable results.


Today, there’s a new focus. As more and more competing companies
address the problem of designing a “system on a chip,” they see more
value in integrated tools that work together seamlessly than in
individual point tools, because tool integration is the key to
increased productivity.


Integrated Design Flow Management


Today, you need fast, reliable flows of design information between
tools. And you want to specify common information just once for
multiple tools; this includes the location of simulation libraries,
macro libraries, and timing information. Though a homegrown,
customized process for specification of common information can often
be automated, updating a single point tool within a flow usually
calls for a complete rewrite of setup information. And using various
point tools within a design flow often requires creation of
additional design data files. That additional design work and
processing decreases your productivity and slows time to market.


The Foundation Series ISE software automatically communicates common
information to each tool and eliminates the need to create data file
overhead. Unlike homegrown flow automation, an integrated design tool
suite is aware of downstream tool requirements. For example, when you
want to perform timing simulation after place and route, an
integrated tool suite can instruct its place and route tools to
produce the timing simulation netlist, so it can be read by the
simulator. Today, winners in the race to market are focusing on
design automation tools that are integrated (see Figure 1).


Integrated Project Management


Given the large number of source files, con- 
trol files, and implementation files generated 
by today’s complex, time-pressed design 
projects, it is not merely desirable, but neces- 
sary, to have an automated, integrated soft- 
ware tool that can manage project files. For 
example, a design project may consist of 
HDL files, IP cores, netlists, user constraints, 
or any combination of these. You know it
can be difficult to manage the project when
one or more of these design modules are
modified.


The Foundation Series ISE software will 
manage all modules in the design for you. 
For example, it knows about all of the HDL 
code in your design, and it knows when the 
code has changed; therefore it will 
know, and can tell you, when 
HDL-generated netlists must be 
updated, and processes re-run. 
Then it will clearly display all 
design sources and implementation 
results, and provide easy access to 
the appropriate editing tool for 


every source file. 


Many HDL compilers, as well as 
schematic entry tools, require that 
you specify a device family library 
up front, to provide appropriate 
library symbols and components 
for a given architecture.
Additionally, if your design is retargeted to a new
device architecture in the middle of your design
project, then you must change the project libraries to
match the new architecture. The Foundation Series ISE
software makes the changes for you. You’re left with
nothing to do but select the device family, once. Your
selection sets the appropriate device libraries for
design entry, and automatically passes device
information forward to the place and route tools.


In the course of a design
cycle, it’s highly likely a
design will be implemented 
many times. For example, 
revisions may be made to 
timing constraints, target 
device, and place and route 
options, in pursuit of the best 
overall design implementa- 
tion. The Foundation Series ISE software 
provides revision control by archiving each 
implementation, along with all design flow 
control files and design constraint files, for 
future reference or use. With this informa- 
tion, you can consult or deploy an archived 
implementation anytime, without recompil- 


ing your entire design (see Figure 2). 


Figure 2 - Foundation Series ISE project snapshots for effective project management


Integrated Environment for Design Optimization


You usually have some overall design strategy 
that you are looking to optimize in your 
design flow. For example, your strategy may
place highest priority on fitting the design in
the smallest possible device, or on getting the
fastest performance. A synthesis tool can be
used to optimize the design’s performance
based on timing requirements, but for the
best results, the place and route tools must
then receive the same information to complete
the design. This can mean setting requirements
twice. However, with the Foundation Series ISE
software, you only have to define the settings
once, so you can optimize your design strategy
faster and more reliably.


Figure 1 - Foundation Series ISE: well-integrated HDL solu-


The Foundation Series ISE software ensures
that the software tools work well together by
communicating with each other and transferring
design data automatically. What's more, front-to-back
design flow strategies are used, so each individual
tool’s features can be leveraged to greatest
benefit. In a non-integrated environment
these communication tasks and decisions
are left to you.


Integrated Environment for Collaboration


Design data, constraints, and strategies flow far
more efficiently when teams of developers work in
collaboration. An integrated environment
makes possible, and enhances, collaborative
work, which is critical during the project
development phase. However, collaboration
presents a new challenge.


Designers, working with an integrated tool, 
in an integrated environment, depend on 
software quality. When your in-house 
designers collaborate with third party part- 
ners for example, and use different tools, 
interoperability problems may occur; you 
can only hope solutions are available from 


each tool’s vendor. 


When you use the Foundation Series ISE 
software, you are assured of software quality 
because it has been tested thoroughly for tool 
interoperability, across the project creation 


lifecycle. 
Conclusion 


Foundation Series ISE provides you with a
complete HDL design environment. Now you can
manage and optimize your design projects, and your
engineers can work collaboratively,
with confidence in Xilinx quality
and technical support.


Learn more about how Xilinx 
Foundation Series ISE meets your require- 
ments for integrated design automation. See 
and hear the Xilinx internet presentation,
“Xilinx Foundation Series ISE: Delivering
the Benefits of HDL Design to
Programmable Logic Designers,”
by going to www.netseminar.com/tbd/tbd.
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New Products 


Software


StateCAD XE: Optimizing State Machines


Now you can implement faster, more ... state machines, with ease.


by Andy Bloom 

Director of Engineering, Visual Software Solutions 
(info@statecad.com)

ambloom@testbench.com 


Ricky Escoto 

Director of Marketing, Visual Software Solutions 
(info@statecad.com)

rescoto@testbench.com 


Control logic is usually implemented as 
finite state machines (FSMs), which usual- 
ly require you to work through multiple 
levels of design and optimization, often 
within tight development schedules. And, 
as designs grow larger, the complexity of 
implementing control logic increases cor- 
respondingly, forcing you to migrate from 
schematics to hardware description languages
(HDLs). StateCAD® XE automates
the state machine development process,
saving you a lot of time and trouble.
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Manual FSM Design 


Until recently, you had to specify control 
logic manually; you had to draw state dia- 
grams by hand (or with a graphics pack- 
age), and then manually translate them to 
schematics or to an HDL. Timing and 
logic problems identified during simula- 
tion resulted in modifications to the orig- 
inal design, which then needed to be re- 
verified, step-by-step. 


This approach tends to be slow, repetitive, 
and error-prone. Translation errors invari- 
ably creep in and require substantial effort 


to eliminate. 


Hardware Description Languages (HDLs) 
allow more logic to be specified and main- 
tained with less effort, and they can be 
synthesized in numerous ways. You can 


control how synthesis operates, allowing 


you to create your design in the manner 


best suited to your target application. 


The way HDL code is structured dramatically
impacts the speed, area, and power
consumption of the synthesized device.
When doing finite state machine design, 
the best results can only be achieved by 
careful consideration of the resources 
available, and by having the flexibility to 


experiment with different alternatives. 
Automated FSM Design Using StateCAD XE 


A quicker way to implement state 
machines optimized for Xilinx devices is 
to use the Xilinx ISE software, which 
includes StateCAD XE. This tool allows 
you to draw complex state diagrams,
choose design-specific optimizations, and
generate synthesizable VHDL, Verilog, or
ABEL-HDL. StateCAD allows you to
change optimizations (including state
assignment mode, output registering, and
signal loading), then regenerate the HDL
automatically.


One advantage of automatic state machine 
translation is the ability to change opti- 
mizations and regenerate code in seconds. 
By trying different code styles, state 
assignment modes, and optimizations, you 
can find which combination yields the 


optimal solution for your design. 
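
As a rough illustration of why the state assignment mode is worth experimenting with, the sketch below compares a binary and a one-hot encoding for the same set of states. It is written in Python rather than HDL, it is not StateCAD output, and the function names are hypothetical; it only shows how the register count differs between the two modes.

```python
from math import ceil, log2

def binary_encoding(states):
    """Assign each state a fixed-width binary code (fewest flip-flops)."""
    width = max(1, ceil(log2(len(states))))
    return {s: format(i, f"0{width}b") for i, s in enumerate(states)}

def one_hot_encoding(states):
    """Assign each state a code with exactly one bit set (one flip-flop per state)."""
    n = len(states)
    return {s: format(1 << i, f"0{n}b") for i, s in enumerate(states)}

states = ["S0", "S1", "S2"]
binary = binary_encoding(states)    # 2 state bits for 3 states
one_hot = one_hot_encoding(states)  # 3 state bits for 3 states
```

Binary encoding minimizes flip-flops but needs more decode logic; one-hot spends a flip-flop per state in exchange for simpler decoding. Which is better depends on the target device and design, which is why regenerating and comparing is useful.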
State Machine Example 


By comparing implementations of a simple
state machine, we can see the impact of
output style on state machine design. The small
state machine in Figure 1 will be implemented
with both registered and combinatorial
outputs, illustrating the impact of output
optimization on implementation.


Figure 1 - Example state machine
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Output Optimization 


Outputs can be optimized for 
speed (registered) or for area 
(combinatorial decode). Com- 
binatorial decoded outputs 
become active by decoding 
state registers (Moore) or by 
decoding state registers and 
inputs (Mealy). Registered 
outputs are calculated prior to 
the active edge of the clock, 
and typically improve speed 
because a level of propagation 
delay is removed, but usually 
require more area than combi- 
natorial implementations. 
Registered outputs are insensi- 
tive to input glitches or to 
multiple state bit changes. 


Design Results 


In Table 1 you can see the registered design has
outputs that change at the same time as the state
bits, and are stable between clocks. The output
delay time is the clock-to-output delay of the
register. All decoding necessary for the output
occurs before the clock, at the same time as the
decoding for the next state. The decode time is
effectively “buried” in the state decode time,
producing a faster design.


REGISTERED OUTPUTS

PROCESS (sreg, RESET) BEGIN
  -- defaults: output low, next state S0
  next_EVEN <= '0'; next_sreg <= S0;
  IF RESET = '1' THEN
    next_sreg <= S0; next_EVEN <= '1';
  ELSE
    CASE sreg IS
      WHEN S0 => next_sreg <= S1;
      WHEN S1 => next_sreg <= S2; next_EVEN <= '1';
      WHEN S2 => next_sreg <= S0; next_EVEN <= '1';
    END CASE;
  END IF;
END PROCESS;

3 Flip Flops, 4 AND Gates, 1 OR Gate


COMBINATORIAL OUTPUTS

PROCESS (sreg, RESET) BEGIN
  -- defaults: output low, next state S0
  EVEN <= '0'; next_sreg <= S0;
  IF RESET = '1' THEN
    next_sreg <= S0; EVEN <= '1';
  ELSE
    CASE sreg IS
      WHEN S0 => next_sreg <= S1; EVEN <= '1';
      WHEN S1 => next_sreg <= S2;
      WHEN S2 => next_sreg <= S0; EVEN <= '1';
    END CASE;
  END IF;
END PROCESS;


In comparison, the combinatorial design
requires time to decode the state bits, yielding
a slower implementation. The advantage
for the combinatorial design is the
smaller area: 5 logic elements compared to
8 for the registered design.
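
To see that the two output styles describe the same behavior and differ only in where the decode happens, the following Python sketch steps the three-state example through a few clocks. It is a behavioral model under stated assumptions, not generated code: the state order S0, S1, S2 and EVEN being high in S0 and S2 are read from the listings, while the names NEXT, EVEN_STATES, combinatorial, and registered are invented for the illustration.

```python
NEXT = {"S0": "S1", "S1": "S2", "S2": "S0"}
EVEN_STATES = {"S0", "S2"}  # states in which EVEN is '1'

def combinatorial(cycles):
    """EVEN is decoded from the state register after each clock edge."""
    state, trace = "S0", []
    for _ in range(cycles):
        even = 1 if state in EVEN_STATES else 0  # decode happens after the clock
        trace.append((state, even))
        state = NEXT[state]
    return trace

def registered(cycles):
    """EVEN is computed before the clock edge and held in its own flip-flop."""
    state, even_reg, trace = "S0", 1, []  # reset values: S0 with EVEN = '1'
    for _ in range(cycles):
        trace.append((state, even_reg))
        next_state = NEXT[state]
        # next_EVEN is decoded alongside the next state, before the clock
        next_even = 1 if next_state in EVEN_STATES else 0
        state, even_reg = next_state, next_even  # both update on the clock edge
    return trace
```

Both functions return the same sequence of (state, EVEN) pairs; the registered version simply spends one more storage element so that no decode is needed after the clock.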


Additional StateCAD Benefits 
StateCAD provides additional benefits to 


Xilinx customers: 


• By automating the complete state
machine development process, the Xilinx
ISE software and StateCAD eliminate
manual coding, translation errors, stale
documentation, and logic bugs.


• StateCAD includes wizards tailored for
designing concurrent state machines
and associated logic. State diagrams can
include states, transitions, Mealy and
Moore outputs, resets, counters,
shifters, multiplexers, and much more.
No HDL knowledge is required to specify
control flow.


• StateCAD exhaustively analyzes state
diagrams for inconsistencies, automatically
identifying more than 200 problems,
such as stuck-at-states, conflicting outputs,
and non-deterministic control flow.


• StateCAD includes a built-in simulator
called StateBench, for behavioral verification
and identification of problems at the
state diagram level.


• StateCAD automatically translates state
diagrams to synthesizable VHDL and
Verilog. Optimizations include one-hot
state assignment, registered outputs, and
prioritized transitions.


• StateCAD is fully integrated within the
Xilinx ISE software, and produces HDL
optimized for Xilinx devices, guaranteeing
you the best possible results.


• StateCAD can import FSMs created with
previous releases of the Xilinx
Foundation Series software.


Table 1 - Comparison of output styles
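
Two of the consistency checks mentioned in the list, stuck-at-states and non-deterministic control flow, can be sketched in a few lines of Python. This is only an illustration of the idea and makes no claim about how StateCAD performs its analysis; the transition format (source, condition, target) and the function names are assumptions made for the example.

```python
from collections import defaultdict

def find_stuck_states(states, transitions):
    """Return states that have no transition to a different state."""
    exits = defaultdict(set)
    for src, _cond, dst in transitions:
        if dst != src:
            exits[src].add(dst)
    return [s for s in states if not exits[s]]

def find_conflicts(transitions):
    """Return (state, condition) pairs that lead to more than one target."""
    targets = defaultdict(set)
    for src, cond, dst in transitions:
        targets[(src, cond)].add(dst)
    return sorted(key for key, dsts in targets.items() if len(dsts) > 1)

states = ["IDLE", "RUN", "DONE"]
transitions = [
    ("IDLE", "start", "RUN"),
    ("RUN", "finish", "DONE"),
    ("RUN", "finish", "IDLE"),   # same condition, different target
    ("DONE", "always", "DONE"),  # self-loop only, never leaves
]
stuck = find_stuck_states(states, transitions)
conflicts = find_conflicts(transitions)
```

A state flagged as stuck may be an intentional terminal state, so a check like this is best treated as a warning to review rather than an error.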


Conclusion 


Using StateCAD XE you can quickly 
implement state machines optimized for 
Xilinx devices. As design parameters 
change, just select a new set of optimiza- 
tions, then regenerate code suited for the 


new requirements. 


StateCAD XE is available at no charge
to Xilinx customers, and is included
with the Xilinx ISE software or can be
downloaded from www.xilinx.com
(download StateCAD from the WebPack
BackPack section).
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HDL Bencher XE 
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Now you can develop complete, timing constrained VHDL 
and Verilog testbenches in minutes. 


by Andy Bloom 

Director of Engineering, Visual Software Solutions 
(info@testbench.com)
ambloom@testbench.com 


Ricky Escoto 

Director of Marketing, Visual Software Solutions 
(info@testbench.com) 

rescoto@testbench.com 


Validating FPGAs can require substantial 
effort, unless you have a high-level software
tool like HDL Bencher XE. The usual 
process requires you to write many test- 
benches, simulate them, check the results, 
and log all failures. To adequately test a 
design involves verifying all the possible cycle 
types available to the device, and may require 
several hundred test cases. And, as your 
design is revised, port definitions may 
change, making your existing testbenches 
obsolete, which results in unnecessary effort 
to update the HDL source code and the 


accompanying testbench. 


To simplify FPGA and CPLD testing, Xilinx 
now includes Visual Software Solutions’ 
HDL Bencher XE in the Foundation ISE 
and WebPack ISE design tools. No knowl- 
edge of HDL or scripting is required. 
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HDL Bencher Overview 


HDL Bencher accepts any HDL design, and 
then lets you select the unit under test 
(UUT), specify stimulus and response 
(using the pattern wizard and the 
WaveTable™ spreadsheet-based interface), 
and then export a complete, self-checking 
testbench automatically. If your HDL 
source contains external dependencies, 
HDL Bencher prompts you to compile 
them locally so that the whole design can be 


simulated. 


HDL Bencher lets you manipulate wave- 
forms in the same way you manipulate a 
spreadsheet. You can cut, insert, and paste 
rows (signals) or columns (time regions) 
with ease, and HDL Bencher automatically 


readjusts timing. 
Interactive Simulation 


The testbenches are automatically updated 
when the HDL source changes, eliminating 
stale test cases. To facilitate design retarget- 
ing, HDL Bencher allows testbenches to be 
moved between VHDL and Verilog with 
one simple command. When compilation 


errors are found during simulation, HDL 


Bencher links the error reported to the 
offending line in the HDL source. 


Create Self-Checking Testbenches 


The testbenches include component instan- 
tiations, generic specifications, stimulus, out- 
put check procedures, and assertions. You 
can create “golden models” for regression 
testing and future design validation; mis- 
matches in expected and actual output values 
are flagged automatically. All the necessary 
timing constraints are faithfully represented 


in the resulting testbench. 
Verify Timing 


By adding timing constraints, you can gener- 
ate VHDL or Verilog testbenches for post- 
synthesis verification. Synthesized netlists 
differ from behavioral HDL because data 
types are remapped, I/O modes are changed, 
unused signals are dropped, and generics are 
flattened. HDL Bencher automatically re- 
maps behavioral testbenches to simulate with 


synthesized netlists. 
Demonstration Design 


As an example, the following HDL code is 
used as input into HDL Bencher: 


library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity counter is
  Port (
    CLK, RESET, CE : in std_logic;
    T : out std_logic;
    COUNT : inout integer range 0 to 7 := 0);
end counter;

architecture behavioral of counter is
begin
  process (CLK, RESET, CE, COUNT)
  begin
    -- asynchronous reset, otherwise count on the rising clock edge when enabled
    if RESET='1' then
      COUNT <= 0;
    elsif CLK='1' and CLK'event then
      if CE='1' then COUNT <= COUNT + 1;
      else COUNT <= COUNT;
      end if;
    end if;
    -- T is high while COUNT equals 1
    if COUNT=1 then T<='1'; else T<='0'; end if;
  end process;
end behavioral;


Within the Xilinx ISE software, you start
by selecting the HDL file from the source
window, then you choose HDL Bencher
from the process window. The design is
automatically imported, and you are given
the opportunity to select worst-case global
timing parameters.


Figure 2 - Stimulus for the design example


Figure 3 - ModelSim running the testbench and design


A waveform is created next (Figure 1), which 
includes all the signals for the unit under test 
(UUT). Individual waveforms are then mod- 
ified directly on the screen by clicking on the 
signals to show the expected behavior, or by 


using the built-in pattern generator. 


Next, HDL Bencher automatically exports a 
self-checking testbench. The testbench 
includes all stimulus (Figure 2), output asser- 
tions, timing constraints, and check routines 
needed to verify the operation of the design. 
The testbench is added to the ISE project, 
then auto-simulated through the Xilinx ISE 
software and ModelSim (Figure 3). 
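
The idea of a self-checking testbench can be sketched outside of HDL as well. The Python model below applies stimulus to a behavioral copy of the example counter, compares each result with the expected value, and logs any mismatch. It is a simplified illustration, not the code HDL Bencher exports: reset is sampled once per cycle, an increment past 7 is treated as an error on the assumption that the simulator enforces the declared range, and the names counter_step and run_testbench are hypothetical.

```python
def counter_step(count, reset, ce):
    """One clock cycle of the example counter (reset simplified to per-cycle)."""
    if reset:
        return 0
    if ce:
        if count == 7:
            # Assumption: leaving the declared range 0 to 7 is a simulation error.
            raise ValueError("COUNT would leave its declared range 0 to 7")
        return count + 1
    return count

def run_testbench(stimulus, expected):
    """Apply (reset, ce) pairs, check COUNT and T, and return logged mismatches."""
    count, log = 0, []
    for cycle, ((reset, ce), (exp_count, exp_t)) in enumerate(zip(stimulus, expected)):
        count = counter_step(count, reset, ce)
        t = 1 if count == 1 else 0  # T is high only while COUNT equals 1
        if (count, t) != (exp_count, exp_t):
            log.append(f"Error at cycle {cycle}: got COUNT={count} T={t}, "
                       f"expected COUNT={exp_count} T={exp_t}")
    return log

stimulus = [(1, 0), (0, 1), (0, 1), (0, 0), (0, 1)]
expected = [(0, 0), (1, 1), (2, 0), (2, 0), (3, 0)]
log = run_testbench(stimulus, expected)
```

An empty log corresponds to the "simulation successful" report; any entry points at the cycle where expected and actual values diverge.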


An advanced version of HDL Bencher is
now available which automatically back-annotates
the expected response into the
waveform. If no expected response was specified,
HDL Bencher back-annotates the
response obtained by ModelSim. Otherwise,
expected and actual responses are compared, and
discrepancies are highlighted.





Once your design is synthesized, its behavioral
testbench may be incompatible
with the resulting VHDL
netlist generated during the
post-route process. In this
case, the resulting netlist
uses std_logic_vector instead
of integers. To make the
synthesized netlist simulate,
you would switch back to
HDL Bencher, re-associate
the waveform with the
synthesized netlist, and
re-export the testbench.
Finally you would switch
back to the ISE software
and re-simulate.
The Resulting Testbench 


The exported testbench in 
this example is 183 lines of 
code, and took under 1 
minute to create and simu- 
late. The following portions 
of the testbench highlight 
some of the aspects of automatic testbench 


generation: 


Automatically Commented
  -- VHDL TestBench created by
  -- Visual Software Solution's HDL Bencher 2.00
Libraries Extracted
  LIBRARY IEEE;
  USE IEEE.std_logic_1164.all;
Log File Created
  FILE results: TEXT IS OUT "results.txt";
Components Instantiated
  COMPONENT counter
  PORT (
    CLK : in std_logic;


    COUNT : inout integer RANGE 0 TO 7
Test Signals Defined
  SIGNAL CLK : std_logic;


  SIGNAL COUNT : integer RANGE 0 TO 7;
Instantiates Unit Under Test
  UUT : counter PORT MAP (
    CLK => CLK,


    COUNT => COUNT
Clock Process Created
  BEGIN
  CLOCK_LOOP : LOOP
    CLK <= transport '0';
    WAIT FOR 10 ns;
    CLK <= transport '1';
Creates Check Procedures
  PROCEDURE CHECK_COUNT(
    NEXT_COUNT : INTEGER
Reports Errors In Expected Values
  IF (COUNT /= NEXT_COUNT) THEN
    write(TX_LOC, string'("Error at time="));
Applies Input Stimulus
  RESET <= transport '1';
  CE <= transport '0';
Validates Timing
  WAIT FOR 100 ns; -- Time=820 ns
Verifies Outputs
  CHECK_COUNT(7,820); -- 7
Reports Success/Failure
  ASSERT (FALSE) REPORT
    "Simulation successful. No problems detected. "
Draw Expected Behavior


Conclusion 


With HDL Bencher you can verify the 
operation of VHDL and Verilog designs in 
minutes; no HDL scripting is needed. The 
resulting testbenches are self-checking, and 
are compatible with the Xilinx ISE soft- 
ware. HDL Bencher XE is available at no 
charge to all Xilinx customers, and is 
included with the ISE software or can be 
downloaded from  www.xilinx.com 
(download the HDL Bencher “BackPack” 
from the WebPack section). 
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New Products 


by Karen Fidelak 


Technical Marketing Engineer, Xilinx 
karen. fidelak@xilinx.com 


Incremental design changes (due to
ECOs, specification changes, and
repeated design iterations) can cause significant
delays if you have to synthesize
and place and route your entire design
after each change. Ideally, your synthesis 
and place-and-route software tools 
should recognize where changes have 
been made in your overall design and 
recompile just those portions that have 
changed. That’s what you get with BLIS, 
a unique synthesis and place-and-route 
capability, developed by Synopsys for 
Xilinx, that provides a guided synthesis 


methodology. Used in conjunction with
Xilinx High-Level Floorplanning, BLIS
provides the most robust incremental
design capability ever offered.


BLIS, a part of the Synopsys FPGA 
Express/FPGA Compiler II v3.4 software 
(FE/FCII), is now available in the Xilinx 
ISE 3.2i development tools. 














Figure 1 - Constraint Editor, specifying Block Roots 


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With Block Level Incremental 
Synthesis (BLIS), your design 
implementation times will 
improve dramatically. 


Block Level Incremental Synthesis 


As you make design changes, BLIS recog- 
nizes “blocks” of the design which have 
been changed at the source, and intelligent- 
ly synthesizes only those portions of the 
design. In this flow, a block is defined as a 
module/entity and any 
hierarchy tree beneath it. 
To enable BLIS, you 
choose blocks in your 
design that you want to 
denote as “Block Roots” 
through the FE/FCII 
Constraint Editor GUI or 
scripting language, as 


shown in Figure 1. 




A Block Root is a block which is intelli- 
gently updated by FE/FCII in incremen- 
tal synthesis runs, and has the following 


characteristics: 


• A separate netlist is created by FE/FCII for
each Block Root.


• Only those Block Roots whose corresponding
source has been modified are re-synthesized.


• The Block Root has hard boundaries
around it; no optimization occurs with
neighboring modules.
The Advantages of BLIS 


There are two main advantages to using this 


type of incremental flow. 


¢ Runtime for both synthesis and place-and- 
route will be improved because only the 
modified portion of your design will be re- 
synthesized and re-netlisted. The remain- 
der of the design will remain unchanged 
and the netlists for the unchanged portions 
of the design will not be rewritten. Because 
the netlists of the unchanged portions of 
the design remain untouched, you are 
assured that all net and instance names in 
that part of your design are identical to ear- 


lier runs. 


¢ Timing predictability will be improved 
because the “Guide” function of the place- 
and-route tools, which relies on matched 
component names from run to run, will 


have a higher success rate. 
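
The selection step behind this flow, re-synthesizing only the Block Roots whose source changed, can be illustrated with a short Python sketch. It is not how FE/FCII detects changes (the source does not say); it assumes a content hash over each block and the hierarchy beneath it, and the names block_fingerprint and blocks_to_resynthesize are invented for the example.

```python
import hashlib

def block_fingerprint(block, sources, children):
    """Hash a block's source text together with every block beneath it."""
    digest = hashlib.sha256(sources[block].encode())
    for child in sorted(children.get(block, [])):
        digest.update(block_fingerprint(child, sources, children).encode())
    return digest.hexdigest()

def blocks_to_resynthesize(block_roots, old_sources, new_sources, children):
    """Return the Block Roots whose hierarchy differs between two runs."""
    return [
        root for root in block_roots
        if block_fingerprint(root, old_sources, children)
        != block_fingerprint(root, new_sources, children)
    ]

# Hypothetical design: top contains uart and dsp; dsp contains mult.
children = {"top": ["uart", "dsp"], "dsp": ["mult"]}
old = {"top": "top v1", "uart": "uart v1", "dsp": "dsp v1", "mult": "mult v1"}
new = dict(old, mult="mult v2")  # edit one leaf module under dsp
changed = blocks_to_resynthesize(["uart", "dsp"], old, new, children)
```

Only the dsp Block Root is selected, because the edited module sits in its hierarchy; the uart netlist would be left untouched, which is what keeps its net and instance names identical from run to run.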
Benchmarks 


We compared the results of incremental 
design flows using BLIS against the more tra- 
ditional methodology of re-synthesizing and 
re-routing the entire design. With the BLIS 
flow, incremental changes are made to a small 
number of design blocks (Block Roots). With 
the traditional flow, incremental changes are
made to the same design blocks, but
they are not specified as Block Roots.


After our example design synthesis was com- 
pleted, the design was placed and routed 
using the Guide feature of the Xilinx implementation
tools, which allows you to specify
an existing placed-and-routed design to be
used as a “Guide” when implementing a
design. The existing placed-and-routed design
was used as a template when re-implementing
the design. Any portions of the design
which existed in both the “Guide” design and 
the new modified design (determined by 
matching net and component names) were 
placed in the same location in the new imple- 
mentation as they were in the “Guide” 
design. New or changed logic was imple- 


mented around existing, “Guided” logic. 
Runtime Improvements 


Runtime improvements of up to 50%
(with an average of 47%) were observed
when using BLIS with Xilinx Guided Place-and-Route
in an incremental design flow;
Figure 2 shows averaged design results.
Because FE/FCII does not re-elaborate or
re-optimize unchanged blocks of the
design, synthesis runtime was reduced.


Figure 2 - BLIS runtime reductions (% runtime reduction, without BLIS and with BLIS)


Figure 3 - BLIS design efficiency


Implementation runtime was improved 


due to increased design component match- 


ing during guided placement and increased 


signal matching during routing. 
Additionally, the synthesis tool does not 
rewrite the EDIF netlists for the unchanged 
blocks, further reducing runtime, because 


no file re-translation is needed. 
Guide Improvements 


When a design is placed-and-routed using 
the Guide feature, the success of the Guide 
can be determined by the “Design 
Components Matched” statistics available 
in the Place-and-Route report. The higher 
the percentage of matched components, 
the closer the incremental design is to the 
original results, leading to better pre- 


dictability of timing and placement results. 
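
A matched-component percentage of this kind can be approximated by comparing the names in the guide design with the names in the new netlist. The Python sketch below is an assumption about the general idea only; the exact way the Place-and-Route report computes its statistic is not described here, and the instance names are made up.

```python
def matched_percentage(guide_names, new_names):
    """Percentage of names in the new design that also appear in the guide design."""
    new_names = set(new_names)
    if not new_names:
        return 0.0
    matched = new_names & set(guide_names)
    return 100.0 * len(matched) / len(new_names)

# Hypothetical instance names; one component was renamed by re-synthesis.
guide = {"u1/ff_0", "u1/ff_1", "u2/lut_3", "u2/lut_4"}
new_design = {"u1/ff_0", "u1/ff_1", "u2/lut_3", "u2/n_17"}
pct = matched_percentage(guide, new_design)
```

Because matching is by name, a block that is re-synthesized with different generated names lowers the percentage even when its logic is unchanged.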


When using the BLIS incremental design 
flow, Guide success rates reached levels of 
at least 95%, and averaged 97%. When 
BLIS was not used to guide the design, 
component and route matching was as low 


as 52%, as shown in Figure 3. 


The improvements when using BLIS can 
be attributed to the increase in net and 
component name matches between the 
original placed-and-routed design and the 
incrementally modified version of the 
design. Because unchanged blocks of the 
design are not re-synthesized, the netlists 
are untouched and thus remain identical 
to the original version. (Even if there 
are no logic changes in the source, re-syn- 
thesizing a block can lead to net and com- 
ponent names being changed in the final 
netlist.) 


Conclusion 


When utilizing FE/FCII Block Level 
Incremental Synthesis in a Xilinx guided 
design, runtimes as well as timing and 
placement consistency exhibit significant 
improvements over a more traditional 
design flow. These enhancements help you 
achieve a higher level of productivity by 
allowing you to synthesize and implement 
incremental design changes, with a signif- 
icantly reduced runtime, while preserving 
the unchanged portions of your design. 
This new design flexibility allows you to 
realize the productivity necessary to com- 


plete large or small FPGA designs faster. 
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New Products PROMs 


New High-Density Virtex
PROMs and Cost-Effective
Spartan-II PROMs
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Xilinx announces the addition 

of the XC17V00 and XC17S00A
families to its existing line of one- 
time programmable (OTP) PROMs. 





by Theresa Vu 
Product Marketing Engineer, Xilinx Inc. 
theresa.vu@xilinx.com 


All Xilinx PROM families are designed
specifically for use with Xilinx FPGAs, so
we offer a complete, pre-engineered,
drop-in configuration solution that works
perfectly the first time, and you are spared
the time-consuming task of designing your
own. We recently introduced two new families,
one for our Virtex FPGAs and one for
our Spartan FPGAs.





Virtex Configuration PROMs


Our low-cost XC17V00
PROMs support Virtex
and Virtex-E FPGAs,
up to 3.2 million system gates, and are
offered in 1-Mb to 16-Mb densities. The
available packages are shown in Table 1.





The 16-Mb 17V16 PROM, a four-fold 
increase in maximum bit density, extends the 
Xilinx leadership in configuration memories 
and provides a one-chip configuration solu- 


tion for our entire line of Virtex FPGAs. 


Key Features 


The XC17V00  serial/parallel PROM 
family is based on our proven, OTP archi- 
tecture that provides a stable, low-cost, 
highly-reliable one-chip configuration 


solution with the following features: 


• 1-Mb to 16-Mb densities.


• Simple, fast, serial FPGA interface that
requires only one user I/O pin.


• Parallel configuration up to 264 Mbps
(17V16 and 17V08 only).


• Available in SOIC, VOIC, VQFP, and
PLCC packages.


• Low-power CMOS floating gate process.


• Programming support by leading programmer
manufacturers.


• Cascadable for storing longer or multiple
bitstreams.
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Device 


Density 





XC17V02
XC17V04
XC17V16


Table 1 - Virtex PROM packages


XCV300E
XCV400E
XCV405E
XCV600E
XCV812E
XCV1000E
XCV1600E
XCV2000E
XCV2600E
XCV3200E


Table 2 - Number of Virtex-E FPGAs
configurable by one 16Mb PROM
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Solution 











Table 3 - Spartan-II PROM packages and 
device compatibility 


The Most Cost-effective Solution 


The new XC17V00 family also offers signif- 
icant savings in board space, design time, 
and cost. Using one 17V16 to configure the 
new 3.2 million system-gate Virtex 
XCV3200E FPGA requires less than one 
fourth the board space of any previous 
Xilinx configuration PROM solution. To 
get the equivalent functionality from our 
nearest competitor would require 14 chips 
and more than 2x the board space, as illus- 


trated in Figure 1. 


Xilinx process expertise has also allowed us 
to use smaller packages, further reducing the 


need for board space. 
Configuration of Multiple FPGAs 


The XC17V16 can also be used to configure 
multiple, daisy-chained FPGAs. This allows 
you to store configuration data for up to 
eight FPGAs in a single PROM, as illustrat- 
ed in Table 2. 


Spartan-II Configuration PROMs 


Our XC17S00A PROM
Family provides a high-performance,
low-cost configuration solution, optimized
for use with Spartan-II FPGAs. This family
offers a dedicated PROM for each gate den- 
sity in the Spartan-II family for ease-of- 
selection and guaranteed compatibility, as 
shown in Table 3. This family also offers 
extended availability of the smallest package 
offered by Xilinx, the 8-pin VOIC. 


Key Features 


• Simple, fast, serial Spartan FPGA interface
that requires only one user I/O pin.


• Available in DIP, VOIC, SOIC, and
VQFP packages.


• Advanced, low-cost CMOS process.


• Programming support by leading programmer
manufacturers.


Figure 1 - The Xilinx solution beats the competition


Conclusion


With the new XC17V00 and XC17S00A
PROMs, there is no easier, faster, or less
expensive way to configure Xilinx FPGAs.
For more information see: www.xilinx.com.
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Applications 


Create Etticient a 
Using Virtex 0 Spartan FPGIs 


The Virtex and Spartan-{l LUTs, configured as shift registers combined with Xilinx True 
Duct Port™ RAM, give you a very compact, flexible 


FPGAs 


by Rotem Gazit 
Design Engineer, MystiCom LID. 
rotemg@mysticom.com 


A Finite Impulse Response (FIR) filter works
by multiplying a vector of the most recent N
data samples by a vector of coefficients and
summing the elements of the resulting vector.
In every cycle the filter receives a new
sample of data and shifts out the oldest
sample. FIR filters are very common
in FPGA-based Digital Signal
Processing applications.


The design concept described here is suitable for systems
with relatively low input rates (0.5 to 8 MHz), which
require a FIR filter implementation with hundreds of taps;
this is common in modem and demodulation applications.
FIR Filter Design Concepts


Figure 2 - Serial FIR filter structure


By examining the FIR block diagram in
Figure 1, you can see that if the filter is
implemented in a straightforward manner, a
multiplier will be required for every filter tap
(N multipliers for an N-tap filter). In addition,
an adder with N inputs will be needed
to sum all multiplier outputs. However, if
the data input rate is slower than the
performance capability of the FPGA, the filter
can be implemented much more efficiently.

Serial FIR Filters


Assuming that the performance capability of
the FPGA is M times faster than the data
input rate, we will examine the case where
M ≥ N (where N is the required number
of filter taps).


A serial N-tap filter uses only
one multiplier, a 2-input adder, and storage
for the partial results and the filter input
samples. The input sample storage holds the
last N input samples. For every new
sample entering the filter, N
multiply operations are
performed, each multiplying
a filter coefficient by the
respective input sample.


The result of each multiply operation
is added to the partial result
storage to produce a new partial
result. This newly calculated partial
result is then saved in the partial
result storage by replacing the
previous partial result. After N such
multiply and add operations, the
partial result storage content is driven
out of the filter. The partial result
storage content is then cleared to
begin processing a new data sample.
A block diagram of the serial FIR filter
structure is shown in Figure 2.
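The serial scheme described above can be sketched as a small behavioral model. This is only an illustration in Python, not the article's hardware: the function names are hypothetical, and the inner loop stands in for the N fast clock cycles that the single MAC spends on each input sample.

```python
def direct_fir(samples, coeffs):
    """Reference model: one multiplier per tap and an N-input adder."""
    history = [0] * len(coeffs)          # history[0] is the newest sample
    out = []
    for x in samples:
        history = [x] + history[:-1]     # shift in new sample, drop the oldest
        out.append(sum(c * h for c, h in zip(coeffs, history)))
    return out


def serial_fir(samples, coeffs):
    """Serial model: one multiplier, one 2-input adder, one accumulator."""
    n = len(coeffs)
    history = [0] * n
    out = []
    for x in samples:
        history = [x] + history[:-1]
        acc = 0                          # partial result storage is cleared
        for k in range(n):               # N multiply-add steps per input sample
            acc = acc + coeffs[k] * history[k]
        out.append(acc)                  # result is driven out after N steps
    return out


if __name__ == "__main__":
    print(serial_fir([1, 0, 0, 2], [1, 2, 3]))   # same values as direct_fir
```

Both functions produce identical outputs; the serial version simply trades N parallel multipliers for N sequential steps.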


The hardware responsible for the combination
of multiplying, adding, and
storing is called a MAC (Multiply
Accumulate) unit. Due to the serial nature
of the filter, the MAC will operate on M
taps of the filter. In the case where N is
greater than M, several serial filters can be
chained together. The oldest data sample
leaving the first filter in the chain is used as
the new data sample in the next filter, and
so on. The results of all the chain filters
must be added together.
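A rough model of the chaining idea, again in Python and under stated assumptions: the function names are hypothetical, pipeline timing is ignored, and each stage is given at most M taps. The oldest sample leaving one stage becomes the new sample of the next, and the stage results are summed.

```python
def fir_reference(samples, coeffs):
    """Plain N-tap FIR: y[t] = sum over k of coeffs[k] * x[t - k]."""
    out = []
    for t in range(len(samples)):
        out.append(sum(c * samples[t - k]
                       for k, c in enumerate(coeffs) if t - k >= 0))
    return out


def chained_serial_fir(samples, coeffs, m):
    """N-tap FIR built from serial stages of at most m taps each."""
    stages = [coeffs[i:i + m] for i in range(0, len(coeffs), m)]
    lines = [[0] * len(stage) for stage in stages]   # one delay line per stage
    out = []
    for x in samples:
        total = 0
        carry = x                         # new sample for the first stage
        for stage_coeffs, line in zip(stages, lines):
            oldest = line[-1]             # sample about to leave this stage
            line[:] = [carry] + line[:-1]
            acc = 0
            for c, h in zip(stage_coeffs, line):
                acc += c * h              # this stage's MAC
            total += acc                  # results of all stages are added
            carry = oldest                # feeds the next stage in the chain
        out.append(total)
    return out
```

Feeding an impulse through `chained_serial_fir` returns the coefficient list in order, which is a quick check that the stage hand-off preserves sample ordering.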




Implementing a Serial FIR Filter 


You can implement serial FIR filters very
efficiently in Virtex and Spartan-II devices.
The design can be divided into three separate
units: the coefficients bank, the MAC
unit, and the input sample storage.

Coefficients Bank

The Virtex block RAM can be used to hold
the filter coefficients. No multiplexer is
needed; all you need is a simple cyclic
counter used as an address generator. In
systems where a host DSP or an adaptation
mechanism is present, the block RAM can
be configured as a dual-port RAM,
enabling the coefficients to be dynamically
changed during normal filter operation.
MAC Unit 


The MAC unit consists of an adder, a multiplier,
and result storage. Careful design of
the adder and multiplier is very important
for area efficiency.


Theoretically, the full-precision result of a filter with 2^k taps,
n bits on every input data sample, and
m bits on every coefficient will be (k + n + m)
bits wide. In real world applications, however,
the number of bits in the result is usually
much smaller because the least significant
bits of the result are usually ignored in
the final result, after processing. It is very
important to throw away those unnecessary
bits as early as possible in the data processing
(in the MAC multiplier and adder).
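The bit-growth arithmetic can be checked with a few lines of Python. This is a sketch with hypothetical helper names; the 64-tap, 5-bit data, 16-bit coefficient figures are taken from the comments in the Figure 3 listing, and the toy products at the end are made up purely to show that discarding bits early and discarding them late need not agree.

```python
def full_precision_bits(taps, data_bits, coeff_bits):
    """Width needed to hold the sum of `taps` products without overflow."""
    growth = (taps - 1).bit_length()      # ceil(log2(taps)) for taps >= 1
    return data_bits + coeff_bits + growth


def truncate_early(products, dropped_bits):
    """Drop LSBs of every product before accumulating."""
    return sum(p >> dropped_bits for p in products)


def truncate_late(products, dropped_bits):
    """Accumulate at full precision, then drop LSBs once."""
    return sum(products) >> dropped_bits


if __name__ == "__main__":
    print(full_precision_bits(64, 5, 16))      # 27, versus the 22-bit MAC result
    print(truncate_early([3, 3, 3, 3], 2))     # 0
    print(truncate_late([3, 3, 3, 3], 2))      # 3
```

The gap between the two truncation results is the effect the article flags as outside its scope; the sketch only shows that it exists.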


An example MAC implementation is shown in Figure 3.
Input Samples Storage Unit 


The input data storage unit can be implemented
very efficiently in Virtex devices
using the LUTs as shift registers. Each
MAC, operating on M taps of the filter,
requires an (M-1)-stage delay line for input
data storage. During the first M-1 cycles, the
delay line output is driven both to the
MAC and back to the delay line input.


In the Mth cycle, the delay line output is
driven only to the MAC, and the new
input data sample enters the delay line. If
several filters are chained together, then the
delay line output needs to wait for M
cycles before it is driven as an input to the
next filter in the chain. Sometimes
(depending on the available resources
inside the device) it is better to implement
the delay line using block RAM
configured as a simple FIFO.

An example of a LUT SRL16-based delay
line implementation is shown in Figure 4.
A diagram of the complete serial FIR filter
is shown in Figure 5.

Throwing away bits in the MAC can
sometimes lead to different results than
you get from throwing away the bits
from the final result; a thorough discussion
of the effect of such an operation on
the filter performance is beyond the
scope of this article.


// Name: mac
//
// Target device:
//
// Module description:
//
// MAC of 16 bit coefficient by 5 bit input data_sample.
// the result is 22 bits wide
//
// Parent:
//
// filter_top
//
// Children:
//
// mac_adder.v, mac_multiplier.v

module mac (coefficient,data_sample,rst,clk,enable,new_data,out);

input [15:0] coefficient;   // filter coefficient coming from coefficient storage
input [4:0] data_sample;    // filter data samples coming from samples storage
input clk,enable,rst;       // rst is treated as active low, matching "negedge rst"
input new_data;             // indicates a new data sample. new_data goes high for one cycle
                            // every 64 clocks, 3 clocks after the new data arrives
                            // because of MAC pipeline.

output [21:0] out;          // MAC output.
reg [21:0] out;             // MAC output changes whenever a new data is being processed.

wire [16:0] mul_out;        // mac_multiplier output.
wire [21:0] add_out;        // mac_adder output.

reg [21:0] add_out_d;       // sampled mac_adder output.
reg [16:0] mul_out_d;       // sampled mac_multiplier output.

mac_multiplier mac_multiplier(.coefficient(coefficient),.data_sample(data_sample),.mul_out(mul_out));

always @(posedge clk or negedge rst)   // sample the multiplier output
begin                                  // to improve timing
  if (!rst)
    mul_out_d <= #2 17'b0;
  else
    mul_out_d <= #2 mul_out;
end

mac_adder mac_adder(.adder_out(add_out),.adder_in_0(mul_out_d),.adder_in_1(add_out_d));

always @(posedge clk or negedge rst)   // sample the adder output
begin                                  // this is the "RESULT storage"
  if (!rst)
    add_out_d <= #2 22'b0;
  else
    if (new_data)                      // clear accumulator for new data processing
      add_out_d <= #2 22'b0;
    else
      add_out_d <= #2 add_out;
end

always @(posedge clk or negedge rst)   // MAC output changes only when a new data arrives
begin
  if (!rst)
    out <= #2 22'b0;
  else if (enable & new_data)
    out <= #2 add_out;
end

endmodule


Figure 3 - An example MAC implementation


// Name: delay_line
//
// Module description:
// delay line of 63 delays x 5 bit.
// data is held for 64 clock cycles before being driven to the next
// delay line in the chain
//
// Parent:
//
// filter_top
//
// Children:
// shift5x63.v shift63.v

module delay_line (new_data_sample,clk,rst,enable,new_data,mac_data,next_mac_data);

input [4:0] new_data_sample;   // new data sample
input clk,enable,rst;
input new_data;                // new_data is active every 64 cycles for one cycle ->
                               // SR mux control (input from its output OR new_data_sample)
output [4:0] mac_data;         // data for MAC
output [4:0] next_mac_data;    // data for next MAC in chain
reg [4:0] next_mac_data;       // Hold next_mac_data back for one MAC cycle
                               // (64 clock cycles)

wire [4:0] mac_data;

shift5x63 shift5x63(.din(new_data ? new_data_sample : mac_data),
                    .clk(clk),.enable(enable),
                    .dout(mac_data)
                   );

// Hold next_mac_data back for one MAC cycle (64 clock cycles)
always @(posedge clk or negedge rst)
begin
  if (!rst)
    next_mac_data <= 5'b0;
  else if (new_data & enable)
    next_mac_data <= mac_data;
end
endmodule

module shift5x63 (din, clk,enable, dout);
input [4:0] din;
input clk,enable;
output [4:0] dout;
shift63 bit0(.din(din[0]), .clk(clk),.enable(enable), .dout(dout[0]));
shift63 bit1(.din(din[1]), .clk(clk),.enable(enable), .dout(dout[1]));
shift63 bit2(.din(din[2]), .clk(clk),.enable(enable), .dout(dout[2]));
shift63 bit3(.din(din[3]), .clk(clk),.enable(enable), .dout(dout[3]));
shift63 bit4(.din(din[4]), .clk(clk),.enable(enable), .dout(dout[4]));
endmodule

module shift63 (din, clk,enable, dout);
input din, clk,enable;
output dout;             // Synplify automatically infers
                         // SRL16 for shift register with no reset
reg [62:0] shifter;
always @(posedge clk)
begin
  if (enable)
  begin
    shifter[62:0] <= {shifter[61:0],din};
  end
end
assign dout = shifter[62];
endmodule


Figure 4 - An example of a LUT SRL16-based delay line implementation


Figure 5 - Serial FIR filter implementation


Conclusion


FIR filters with many hundreds of taps
can be implemented easily even in the
smallest members of the Virtex and
Spartan-II families. By taking
advantage of the Virtex and Spartan-II
architecture, you can implement FIR filters
very efficiently.


About MystiCom


Founded in 1997, MystiCom is dedicated
to providing DSP and mixed-signal
VLSI cores for high-speed communications.
The company's first product line
implements the physical layer (PHY) for
Local Area Networks (LANs) using Fast
Ethernet and Gigabit Ethernet protocols.
MystiCom is headquartered in
Netanya, Israel, and has marketing and
customer support offices in Mountain
View, Calif. Additional information can
LogiCORE PCI Module Is a Key
Element in Voice over IP Applications


Silicon & Software Systems provides an elegant solution to Nortel Networks, using
a Xilinx LogiCORE PCI module implemented in a Spartan-XL FPGA.


by Dara Hurley 

Director, Hardware Systems Division 
Silicon & Software Systems 
dara.hurley@s3group.com 


Voice over Internet Protocol (VoIP) offers
companies and consumers enormous potential
cost savings compared to traditional
switched telephone networks. This emerging
technology is enabling low-cost international
telephony and remote teleworking (also
known as telecommuting).


Nortel Networks, a pioneer in VoIP, has
employed a LogiCORE™ PCI module in a
Xilinx Spartan™-XL FPGA to improve service.
The system architecture is shown in
Figure 1. The system is based on the popular
PC architecture. The network adapter
receives data (containing compressed speech)
from the Internet and passes the data to the
DSP compression/decompression engine via
the PCI bus. The digital speech is then routed
back through a time-switched FPGA to
the telecommunications network.


In the other direction, digital speech is routed
from the telecommunications network via
two Xilinx FPGAs to the DSP engine. An
x86 CPU controls the system. The DSP
engine uses a system memory, which is
connected to the CPU local bus. The CPU
provides IP packet processing, and data is
transferred from the system memory to the
network adapter using DMA on the PCI bus.


“Critical to the system performance is the
PCI implementation,” said Eugene Garvin,
development manager of Nortel Networks.
“The DSP bus operates at a much slower speed
than the PCI bus, so the realization of the bus
must be optimized.”


Initially, Nortel used a simple approach: Data
transfer was routed through the CPU and
wait states were inserted in the PCI transactions
to compensate for the different data
rates. This approach, however, consumed too
much of the PCI throughput, Garvin stated.


Figure 1 - Architecture
of Voice over IP network


This is where Silicon & Software Systems
(S3) stepped in and provided a more elegant
solution. Silicon & Software Systems
designed a DMA controller and FIFO data
buffer, and integrated these along with a
Xilinx LogiCORE PCI module into a
Spartan-XL FPGA device (XCS40XL). The
device also contained an interface between
the time-division multiplexed (TDM)
speech data from the telecommunications
network and the data presented to the DSP
compression/decompression engine. The
designers used Spartan SelectRAM™
memory to create the dual-ported RAM-based
FIFO buffer.


“Use of DMA and data buffering over the
PCI bus has freed up the processor to do
other necessary tasks,” Garvin reported. “It
also makes better use of the available PCI
throughput, because it uses zero-wait-state
burst transfers instead of one-word-delayed
transactions.”





FPGAs 


Creating Finite State Machines
Using True Dual-Port Fully
Synchronous SelectRAM Blocks


Create very dense, high-performance, highly efficient designs that require no logic resources.


by Edgard Garcia 
Senior Engineer, Multi Video Designs 
edgard.garcia@mvd-fpga.com


The latest Virtex, Virtex-E, and Spartan-II
FPGA families offer a broad range of unique
features, including block RAM, that give you
dramatic speed and density improvements.
The dedicated RAM blocks allow you to
build fast and dense bidirectional data
buffers and FIFOs, with built-in data width
conversion. This RAM can also be used to
implement very fast and efficient sequencers
and Finite State Machines (FSMs), which
frees your logic gates for other tasks.


A well-known approach to building
sequencers consists of a ROM-based design
with output registers. The same method can
apply to the Virtex 4K-bit block
SelectRAM™, which incorporates output
registers. You can use a single 4K-bit RAM
block as a 512 x 8 clocked ROM to implement
a very fast FSM working at more than
150 MHz, and it uses no CLBs. You can
implement the following, for example:


• 16 states + 4 additional outputs and 5
inputs + Enable and Synchronous Reset.


• 32 states + 3 additional outputs and 4
inputs + Enable and Synchronous Reset.


Design Example 


The following example shows how to implement
FSMs or sequencers with a single 4K-bit
block SelectRAM. The same method can
easily be expanded to more complex designs
that could require two or more blocks.


Synchronous FSMs and sequencers have
some important characteristics in common:


• They are clocked by a single clock.


• A feedback path allows you to (partially)
define what the next step will be.


• They may need a clock enable to suspend
the operations.


• They must have a reset to go back to a
predefined state.


Figure 1 shows a typical FSM or sequencer
logic diagram.
Simulation


Converting the FSM behavior to a truth
table can be a very tedious and time-consuming
task; debugging and modifying the
design can turn into a nightmare.


The method used here takes advantage of
modern design entry, simulation, synthesis,
and implementation tools, combining their
complementary strengths. The goal is
to use a VHDL simulator to automatically
generate the constraint file for initialization
of the block SelectRAM used as ROM.


One of the problems encountered when
simulating a design with VHDL is that an
output can't be forced to any logic level. An easy
way to avoid this inconvenience consists of
breaking the feedback loop, which allows
you to enter patterns into the inputs, including
current and illegal states. Figure 2 illustrates
one way of designing the FSM VHDL
code so it can be simulated more easily. A
top-level file can provide the feedback for a
classical simulation of the FSM.
Optimization 


If the number of inputs (including the
binary-encoded state feedback) is nine or fewer, and
the number of outputs (including the
binary-encoded state) is eight or fewer, a single 4K-bit
block SelectRAM can be applied by using
structural VHDL with the scheme shown in
Figure 3.





Figure 2 - Modified FSM simplified block diagram


Figure 3 - Using fully synchronous RAM blocks for FSM implementation
(RAMB4_S8, 512 x 8 primitive; labels: Inputs on ADDR[4:0], Outputs on DO[3:0],
binary encoded state, WE, DI[7:0], tied to GND, EN, RST)


RAM Initialization


By initializing the contents of the memory
with the appropriate values, the behavior of
any synchronous FSM can be reproduced.
The initialization of the RAM block is done
by an NCF (constraint) file that will be used
by synthesis and Xilinx implementation
tools. A very easy way to initialize the memory
with the correct values is to generate
the NCF file automatically, by
using a VHDL simulator and another testbench.

Consider the behavioral VHDL code of the
FSM without feedback. A simple 9-bit pseudo
counter (generated by a testbench) can
provide all the 512 possible states of the
inputs, including illegal states. Each associated
result (state output and FSM outputs) can
thus be converted to a text file obeying the
NCF file format. See Figure 4.
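The table-generation idea can be sketched in Python rather than with a VHDL testbench. Everything specific here is an assumption made for illustration: the example FSM is hypothetical, the address is taken as {state[3:0], inputs[4:0]} and the data word as {next_state[3:0], outputs[3:0]}, and the packing into sixteen 256-bit hex strings (lowest address in the least significant byte) is one plausible layout that should be checked against the tool documentation before use. No NCF syntax is emitted.

```python
def next_state_and_outputs(state, inputs):
    """Hypothetical 16-state FSM used only to fill the table."""
    if inputs & 0b10000:          # input bit 4: return to state 0
        nxt = 0
    elif inputs & 0b00001:        # input bit 0: advance
        nxt = (state + 1) % 16
    else:
        nxt = state               # hold
    outputs = (state ^ (state >> 1)) & 0xF   # Gray-coded state as 4 outputs
    return nxt, outputs


def build_rom():
    """Sweep all 512 addresses, as the 9-bit pseudo counter would."""
    rom = []
    for addr in range(512):
        state, inputs = addr >> 5, addr & 0x1F
        nxt, out = next_state_and_outputs(state, inputs)
        rom.append((nxt << 4) | out)
    return rom


def run_rom_fsm(rom, input_seq, state=0):
    """Clocked ROM with feedback: upper data nibble drives upper address bits."""
    trace = []
    for inputs in input_seq:
        data = rom[(state << 5) | inputs]
        state = data >> 4
        trace.append((state, data & 0xF))
    return trace


def pack_init_strings(rom):
    """Assumed layout: 16 strings of 64 hex digits, 32 bytes each."""
    strings = []
    for block in range(16):
        word = 0
        for i in range(32):
            word |= rom[block * 32 + i] << (8 * i)
        strings.append(f"{word:064X}")
    return strings
```

Because every address is filled, illegal input combinations get a defined entry as well, which mirrors the point made above about sweeping all 512 input states.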


Figure 4 - Testbench for automatic NCF file generation


Resources Required


Table 1 summarizes some examples of typical
FSMs or sequencers in terms of performance
and logic resources. All these designs
can be implemented with a single RAM
block, but the same method can easily be
expanded to more complex functions,
providing similar improvement. As you can see,
a RAM-based FSM is much faster and uses
no logic resources.


Implementation mode                 Number of FFs   Number of Slices   RAM Blocks   Speed*
One Hot Encoding
Binary Encoding
SelectRAM block (binary encoding)

* Results for a single 16 (or 32) state synchronous FSM with Enable, synchronous
Reset, 5 (4) inputs and 4 (3) outputs.

Table 1 - Single FSM implementation comparisons


True Dual-Port RAM Advantages


Block SelectRAM provides true dual-port
capability; a single location can be read at
the same time by the two ports. Therefore, a
single block can be used to implement two
identical synchronous FSMs, with separate
inputs, synchronous reset, and clock enable;
you can also implement separate clocks, if
needed. Figure 5 shows the architecture of a
dual FSM using a single dual-port block
SelectRAM. Table 2 shows a dual FSM
implementation comparison.


Figure 5 - Dual FSM implementation
(RAMB4_S8_S8, 512 x 8 DPR primitive; port A: Inputs_A, Clock_A, Enable_A, Reset_A,
Outputs_A, binary encoded states of FSM_A; port B: Inputs_B, Clock_B, Enable_B, Reset_B,
Outputs_B on DOB[3:0], binary encoded states of FSM_B on DOB[7:4] and ADDRB[8:5];
WEA, WEB, and DIA[7:0] tied to GND)


Implementation mode                 Number of FFs   Number of Slices   RAM Blocks   Speed*
One Hot Encoding                    40/70           60...120                        80...110 MHz
Binary Encoding                     16/16           50...100                        70...100 MHz
SelectRAM block (binary encoding)

* Dual 16 (or 32) states synchronous FSM with Enable, synchronous Reset, 5 (4)
inputs and 4 (3) outputs.

Table 2 - Dual FSM implementation comparisons


Conclusion


The Virtex architecture provides powerful
features and flexibility, allowing you to create
very dense and high performance designs
by taking advantage of the innovative features
such as block SelectRAM. By combining
the power of the Xilinx architecture and
implementation tools, with the associated
VHDL synthesizers and simulators, your
design productivity is greatly improved, and
complex designs can be easily implemented
or modified.

For more information, please e-mail:
Edgard.garcia@mvd-fpga.com


About Multi Video Designs


MVD is a design and training center, specializing in Xilinx FPGA/CPLD
designs and Hardware Description Languages. Consulting services and on
site classes are offered in France, neighboring countries, and South America,
in French, Spanish, and Portuguese. More information on our activity as
well as the source code of some examples are available on our website, at:
www.mvdfpga.com/training/VHDL_examples.htm
Design a Low-Power
SMBus System Using
CoolRunner CPLDs


The ultra low-power CoolRunner CPLD is the ideal
choice for SMBus systems, because you can easily
configure it to suit your specific needs.


by John Hubbard


Applications Engineer, Xilinx, Inc.
john.hubbard@xilinx.com

Low-power devices use the System
Management Bus (SMBus) protocol to
communicate with components and peripherals.
SMBus is a compatible derivative of
the I²C two-wire serial bus protocol and can
therefore reside in the same device. In addition
to the I²C features, SMBus enhances
systems designed for power management
tasks. Because SMBus is often used in handheld
devices containing “smart” batteries,
CoolRunner CPLDs are the perfect solution
for implementing the low-power SMBus
components designed into these batteries.


A system that uses the SMBus protocol can
pass information between components
without the need for individual control
lines. This passed information can contain
manufacturer information, model numbers,
part numbers, system status, control parameters,
errors, and so on. SMBus is so flexible
you can even add or remove components
during system operation. It can also
arbitrate in multi-master systems and
determine whether a newly inserted
device is able to check packets of
data for errors.
Functionality 


The SMBus implementation for the
CoolRunner CPLD consists of a
microcontroller or microprocessor
interface and an SMBus
master/slave controller as shown
in Figure 1. It implements the
following features:

• Microcontroller interface.

• Master or slave operation.

• Multi-master operation.

• Host operation.

• Automatic mode switching between master and slave.

• Calling address identification.

• START and STOP signal generation and detection.

• Repeated START signal generation.

• Acknowledge bit generation and detection.

• Bus busy detection.

• From 10 to 100 kHz operation.

• Optional signal SMBSUS# for system suspend mode.

• Optional signal SMBALERT# for slave interrupt request.

• Packet Error Code (PEC) using 8th-order polynomial
Cyclic Redundancy Check (CRC-8) methods.

• Automatic determination of PEC capable devices.

• Compliant with System Management Bus
Specification Rev. 1.1. (Note: the new
SMBus 2.0 Specification was posted
Aug. 3. See http://www.smbus.org/index.html.)

• Software selectable SMBus acknowledge bit.

• Arbitration.


Figure 1 - Basic SMBus controller functions
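The PEC feature in the list above can be illustrated with a bit-serial CRC-8 in Python. This is a sketch, not the CPLD design: the article does not give the polynomial, so the code uses the one defined by the SMBus specification, x^8 + x^2 + x + 1 (0x07), with a zero initial value and MSB-first processing. The function name and the sample message bytes are hypothetical.

```python
def smbus_pec(data: bytes) -> int:
    """CRC-8, polynomial 0x07, initial value 0, processed MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF   # shift and apply polynomial
            else:
                crc = (crc << 1) & 0xFF            # shift only
    return crc


if __name__ == "__main__":
    message = bytes([0x5A, 0x10, 0x34])            # made-up address/command/data
    pec = smbus_pec(message)
    # A receiver that runs the same CRC over message + PEC ends at zero.
    print(hex(pec), smbus_pec(message + bytes([pec])))
```

The zero-remainder property in the last line is what lets a receiver check a packet without storing the transmitted PEC separately.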


A more detailed description of the
SMBus controller is shown in Figure 2.
You can easily modify the microcontroller
interface to adapt the design to any
microcontroller of your choice.


This design was created in VHDL using
Xilinx WebPACK™ ISE (Integrated
Synthesis Environment). It was verified
using ModelSim™ XE (Xilinx Edition)
simulation software.
Conclusion 


You can get started with your own
CoolRunner CPLD SMBus design by visiting
the Xilinx website at
www.xilinx.com/xapp/xapp353.pdf.


Download (for free) the following:

• Complete detailed application notes.

• Complete VHDL source code.

• VHDL test benches.


Figure 2 - SMBus controller detailed block diagram
These techniques can lower your CoolRunner power consumption by 40%.


by Frank Wirtz


Staff Applications Engineer, Xilinx Inc.
frankw@xilinx.com


With the advent of Fast Zero Power technology
and CoolRunner CPLDs, you can
now create portable, high-performance,
low-power, programmable devices, effortlessly.
And, with some additional effort,
you can reduce your power consumption
by as much as 40%. However, to accomplish
ultra low power reductions, you
must first understand the mechanics of
CPLD logic generation.


Xilinx has published a new application
note, “Low Power Tips for CPLD Design”
(XAPP346), that describes design techniques
you can use to further reduce
power consumption in CoolRunner
CPLDs, which are already the lowest
power-consuming CPLDs in the world.
Here are some highlights from that
application note.
Tips and Tricks for Reducing Power 


The CoolRunner XPLA architecture gives
you a flexible logic allocation tool that
allows you to decrease power consumption
by placing your logic in optimum
locations. To use this tool, you need to
understand the basic architecture of the
device, and you need to know how the
fixed geometry of the device determines
both the speed-sensitive paths and the
power-sensitive paths.


The following design implementation
techniques are just a sampling of the
information you can use to slash power
consumption to a minimum:


Terminate!


You must properly terminate all inputs to a
CMOS buffer. A single floating pin can
result in an increase of quiescent current by
13mA. Slow input transitions will also
cause unnecessary power use. Test data
shows that input buffer power consumption
doubles if input rise time increases
from 800ps to 5ns per input.


Congregate! 


You can see how your design is imple- 
mented by reviewing your fitter report, 
and then adjust the fit to constrain your 
high frequency signals to a single logic 
block. This will decrease the distribution 
of high-speed nets and further decrease 
power consumption. 


Modulate! 


The application note details special clock
considerations and explains how asynchronous
clocking can provide low power benefits.
Typically, asynchronous clocking
increases power consumption. Modulation
in this instance refers to only applying a
clock signal to a register when it is required.
Many designs have registers that infrequently
change state, yet the clock signal is
continually present and applied to the register.
While asynchronous design
techniques are usually discouraged,
they do provide designers with
additional flexibility when low
power (or sometimes high speed)
characteristics are required.


As an example of this technique,
consider a counter circuit. In the
case of a binary counter, not all of
the registers change state on each
significant clock edge. Designers can use
a high speed clock for the LSBs of the
counter, and then use a prescaled clock
for the higher order bits, so the total
amount of power required by the clock
buffers is decreased.
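A back-of-the-envelope count shows why the prescaled clock helps. The Python sketch below only counts clock edges delivered to registers; it is not a power model, and the 8-bit width and 4/4 split are hypothetical choices for illustration.

```python
def clock_edges_delivered(bits, cycles, prescaled_upper_bits=0):
    """Register clock edges over `cycles` fast-clock cycles.

    The lower bits see every fast-clock edge; the upper bits see one
    edge each time the lower bits roll over (the prescaled clock).
    """
    lower = bits - prescaled_upper_bits
    rollover = 2 ** lower
    return lower * cycles + prescaled_upper_bits * (cycles // rollover)


if __name__ == "__main__":
    plain = clock_edges_delivered(8, 256)                          # 2048
    split = clock_edges_delivered(8, 256, prescaled_upper_bits=4)  # 1088
    print(plain, split, 1 - split / plain)
```

In this example the split counter delivers a little over half as many clock edges as the fully clocked one; the actual power saving depends on the device and is not something this count can predict.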
Mixed Voltage Interfacing 


When interfacing devices that have different
VCC levels, consider the impact caused
by underdriving a CMOS input. Because
a CMOS input buffer comprises at
least two primary transistors, a P-channel
pull-up and an N-channel pull-down, there
exists a region of input voltage where both
transistors are slightly on, and current flows
from VCC to GND through these buffers.
This wastes power, and since
the output of this buffer may also be in the
linear region, it can cause problems because
other internal devices depend upon the
output voltage of the buffer for driving
their inputs.


In some cases, mixed voltage interfacing is
necessary. Slight modifications to differing
VCCs can drastically reduce power
consumption in these instances. For
example, the XPLA3 devices may be powered
at 3.3V +10%, and 5V devices may
be powered at 5V -10%. This changes the
differential between Voh and Vih by
800mV per input, and will significantly
reduce wasted power. However, examine
the data sheet to ensure safe and reliable
operating conditions.
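The 800mV figure can be reproduced with simple arithmetic. The Python sketch below uses the two supply rails as a stand-in for Voh and Vih, which is an assumption; real thresholds come from the data sheets. The helper name is hypothetical.

```python
def rail_gap(high_nominal, low_nominal, high_tolerance, low_tolerance):
    """Difference between the two supply rails after applying tolerances."""
    high = high_nominal * (1 + high_tolerance)
    low = low_nominal * (1 + low_tolerance)
    return high - low


if __name__ == "__main__":
    nominal = rail_gap(5.0, 3.3, 0.0, 0.0)         # 5.0 V vs 3.3 V
    adjusted = rail_gap(5.0, 3.3, -0.10, +0.10)    # 4.5 V vs 3.63 V
    print(round(nominal, 2), round(adjusted, 2), round(nominal - adjusted, 2))
```

The gap shrinks from 1.7 V to 0.87 V, a change of about 0.83 V, which is in line with the roughly 800mV quoted above.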
Default System Conditions 


Attention to default system operating conditions
may provide an insight into ways
you can further decrease power consumption.
As an example, a CPLD may be interfaced
to a CMOS microcontroller with
programmable (polarity sensitive) interrupts.
If it is necessary to interface a 3.3V
CPLD to a 5V interrupt, system power can
be saved by programming the microcontroller
interrupt such that the system operates
with the interrupt level normally low.
This decreases the amount of time that the
interrupt is active (high), which will reduce
the overall amount of power consumed
when underdriving a CMOS input.
The Effects of Implementation Style 


Implementation style affects power consumption.
For example, consider how different
types of counters are implemented;
binary, Gray, and LFSR counters are created
in different ways and require different
amounts of resources. Keep in mind
that a minimal number of changing signals
will always deliver the lowest dynamic
power solution.


Binary Counters 


Typical binary counters will have their outputs
changing state at a rate of:

\[ \text{Percent of bits toggling} = \frac{2^{n+1} - 2}{n \cdot 2^{n}} \times 100 \]

Where the number of bits of the counter = n


So, a typical 8-bit binary counter would
have approximately 25% of its bits changing
state for any single clock edge.
LFSR Counters 
LFSR (Linear Feedback Shift Register)
counters are wonderful solutions for FPGA
users who need to keep look-up table fan-in
to a minimum. However, because of the
internal hardwired feedback of CPLDs,
this type of counter consumes much more
power than the other counter examples
described here.


For example, an 8-bit LFSR counter has
approximately 50% of its bits changing
on average for any single clock
edge. In comparison, an 8-bit binary
counter changes at a 25% bit rate.
Gray Code Counters


Because of their characteristic step
pattern of a single changing bit, Gray
code counters offer designers the lowest
power consumption of these three
counter methods. The average bit change
rate for an 8-bit Gray code counter is
approximately 13% as defined by the equation:

\[ \text{Percent of bits toggling} = \frac{1}{n} \times 100 \]

The Gray code design implementation is the
most difficult, however, because next-state
information must be coded for each count
value.
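The three toggle rates quoted in this section can be checked by brute force. The Python sketch below walks each 8-bit counter through one full cycle and counts changed bits; the LFSR tap set (8, 6, 5, 4) is an assumption, chosen because it is a commonly tabulated maximal-length set, and the function names are hypothetical.

```python
def toggle_rate(states, width):
    """Average fraction of bits changing per clock over one full cycle."""
    total = 0
    for i, current in enumerate(states):
        following = states[(i + 1) % len(states)]    # includes the wrap-around
        total += bin(current ^ following).count("1")
    return total / (width * len(states))


def binary_states(width):
    return list(range(2 ** width))


def gray_states(width):
    return [i ^ (i >> 1) for i in range(2 ** width)]


def lfsr_states(seed=1):
    """8-bit Fibonacci LFSR with assumed taps 8, 6, 5, 4."""
    states, s = [], seed
    while True:
        states.append(s)
        feedback = ((s >> 7) ^ (s >> 5) ^ (s >> 4) ^ (s >> 3)) & 1
        s = ((s << 1) | feedback) & 0xFF
        if s == seed:
            return states


if __name__ == "__main__":
    print(toggle_rate(binary_states(8), 8))   # about 0.25
    print(toggle_rate(gray_states(8), 8))     # 0.125
    print(toggle_rate(lfsr_states(), 8))      # about 0.50
```

The ordering matches the text: Gray code toggles the fewest bits, binary roughly twice as many, and the LFSR roughly twice as many again.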
The Bottom Line 


Take full advantage of the CoolRunner low 
power benefits by downloading the free “Low 
Power Tips for CPLD Design” application 
note (in PDF format) from the Xilinx website 
at: www.xilinx.com/xapp/xapp346.pdf. 
Creating a Low Power Serial Peripheral Interface


How to implement communications between microprocessors and peripherals,
using the Xilinx CoolRunner XPLA3 CPLDs.


by Anita Schreiber 
Staff Applications Engineer, Xilinx 
anita.schreiber@xilinx.com 


The CoolRunner implementation of a Serial 
Peripheral Interface (SPI) Master described 
here can be used to add an SPI controller to 
microprocessors or microcontrollers that do 
not provide this interface. It will permit 
direct inter-processor communication and 
communication with numerous commer- 
cially available peripherals. 


Serial Peripheral Interface Protocol 


SPI is a full-duplex, synchronous, serial data 
link. A single SPI device is configured as a 
master; all other SPI devices on the SPI bus 


are configured as slaves. 
The SPI bus consists of four wires:


• Serial Clock (SCK) - Driven by the SPI master and regulates the flow of data bits. The SPI specification allows a selection of clock polarity and a choice of two fundamentally different clocking protocols on an 8-bit oriented data transfer.


• Master Out Slave In (MOSI) - Data output from the SPI Master and input to the SPI Slaves.


• Master In Slave Out (MISO) - Data input to the SPI Master and output from the selected SPI Slave. Only one selected slave device can drive data out from its MISO pin.


• Slave Select (SS) - Selects a particular slave via hardware control. Slave devices that are not selected do not interfere with SPI bus activities. The SS control line can be used as an input to the SPI master indicating a multiple-master bus contention (SS_IN). If the SS signal to the master is asserted, it indicates that some other device on the bus is attempting to be a master and address this device as a slave. Assertion of SS automatically disables SPI output drivers in the master device if more than one device attempts to become master.


The SCK, MOSI, and MISO pins of all SPI 
devices on the SPI bus are connected togeth- 


er in parallel. 
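

The full-duplex behavior described above can be pictured as two shift registers exchanging their contents over MOSI and MISO. The following is a minimal behavioral sketch, not the CoolRunner VHDL; the MSB-first bit order and the function name are assumptions made for illustration.

```python
def spi_transfer(master_byte, slave_byte, bits=8):
    """Exchange one word between a master and a selected slave, MSB first (assumed)."""
    mask = (1 << bits) - 1
    master, slave = master_byte & mask, slave_byte & mask
    for _ in range(bits):                      # one SCK period per data bit
        mosi = (master >> (bits - 1)) & 1      # master drives its MSB onto MOSI
        miso = (slave >> (bits - 1)) & 1       # selected slave drives its MSB onto MISO
        # On the sampling edge each side shifts in the bit driven by the other.
        master = ((master << 1) | miso) & mask
        slave = ((slave << 1) | mosi) & mask
    return master, slave


rx_master, rx_slave = spi_transfer(0xA5, 0x3C)
print(hex(rx_master), hex(rx_slave))
```

After eight clocks the two registers have swapped contents, which is why every SPI write is also a read.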
CoolRunner SPI Master Implementation 


This SPI master design supports the follow- 


ing features: 
• Microcontroller interface.

• Multi-master bus contention detection and interrupt.

• Eight external slave selects.

• Four transfer protocols available with selectable clock polarity and clock phase.

• SPI transfer complete interrupt.

• Four different bit rates available for SCK.


A high-level block diagram is shown in Figure 1. The microcontroller (µC) interface is a VHDL module that you can easily modify to support other microcontrollers.


The Address Decode/Bus 
Interface logic interprets 
the bus cycles of the 
microcontroller 
and performs the 
read/write opera- 
tions to the 
Register File. 

The Register File is the interface between the µC and the SPI master logic, and allows configuration of the SPI master. Status of the current
transfer is provided to the µC via a status register in the Register File. Registers are also included to contain the µC data to be transmitted on the SPI bus and data received
from the SPI bus. The SPI Control State 
Machine controls the shifting and loading of 
SPI data in the SPI shift registers, and the 
generation of the slave select signals. The 
SCK clock logic generates an internal SCK 
based on the settings in the control register 


for clock phase, division, and polarity. 
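

As an illustration of how clock polarity, phase, and division interact, the sketch below lists the SCK edges of one transfer and the action taken on each. It follows the common SPI convention (idle level set by polarity, data sampled on the leading edge when phase is 0); the function names and the divisor values are hypothetical and are not taken from the reference design.

```python
def sck_edges(cpol, cpha, bits=8):
    """Return (idle_level, [(edge, action), ...]) for one transfer."""
    leading = "rising" if cpol == 0 else "falling"    # first edge away from idle
    trailing = "falling" if cpol == 0 else "rising"   # edge returning to idle
    first, second = ("sample", "shift") if cpha == 0 else ("shift", "sample")
    edges = []
    for _ in range(bits):
        edges.append((leading, first))
        edges.append((trailing, second))
    return cpol, edges


def sck_frequency(system_clock_hz, divisor):
    """SCK rate for a given divisor of the system clock."""
    return system_clock_hz / divisor


for cpol in (0, 1):
    for cpha in (0, 1):
        idle, edges = sck_edges(cpol, cpha)
        print(f"CPOL={cpol} CPHA={cpha}: idle={idle}, first edge {edges[0]}")

# Four selectable bit rates; these divisors are placeholders for illustration.
for divisor in (4, 8, 16, 32):
    print(divisor, sck_frequency(16e6, divisor))
```

The two polarity settings combined with the two phase settings give the four transfer protocols listed among the design features.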
Conclusion 


CoolRunner CPLDs operate at the lowest standby power (<100 µA) of any CPLD available today, and they are an ideal programmable logic solution for providing interface controllers in portable or power-sensitive applications. See www.xilinx.com/apps/epld.htm#CoolRunner for an SPI reference design, which contains a detailed application note (XAPP348), VHDL source code, and VHDL testbenches.


Figure 1 - CoolRunner SPI master 
CoolRunner CPLDs Beat the Heat


The disadvantages of high power components... 


by Steve Prokosch 
CoolRunner Product Marketing, Xilinx Inc. 
sprokosch@xilinx.com 


Conventional thinking assumes that high 
performance requires high power consump- 
tion, but the Xilinx XPLA3 CoolRunner 
CPLD family defies convention. In both 
ultra-low standby and total current con- 
sumption modes, Xilinx XPLA3 complex 
programmable logic devices consume less 
power than any other CPLDs in the world. 


By consuming the least amount of power, 
CoolRunner CPLDs radiate the least 
amount of heat. Devices that emit excessive 


heat can cause serious problems, such as: 
• Higher FIT (Failures In Time) rates.

• Intermittent field failures.

• Higher cabinet or enclosure design and manufacturing costs.

• Increased risk of EMI/RFI leakage (caused by extra cooling vents).

• Mechanical stress to package parts.

• Printed circuit board layout concessions.

• Design compromises that affect the overall size and appeal of the end product.
The High Costs of High-Power Components 


In addition to radiating destructive heat, devices that consume high amounts of power also generate extra costs. For instance, when you buy power supplies, the more power output you need to run the system, the higher the cost. Many designers accept the high costs of power supplies without ever questioning the efficacy and efficiency of the devices demanding all that power.


Furthermore, a device that consumes a lot 
of power may require the addition of 
another physical component, such as a 


larger cooling fan or a heat sink. Adding to 


the BOM (bill of materials) can be expen- 
sive. Besides the cost of the component 
itself, additional expenditures of time and 


money can be incurred: 
• Locating and ordering the component.

• Shipping and delivery costs and delays.

• Controlling inventory.

• Dealing with the availability of the physical device.


Availability can be a real budget-buster; if 
that one single component cannot be 
obtained or delivered in a timely fashion, the 
entire system cannot be shipped to the cus- 
tomer. Assembly costs continue to rise until 
all components are delivered, installed, and 
shipped. Such a delay can turn a company’s 


bottom line upside down. 
The CoolRunner Reliability Advantage 


Heat plays an enormous role in determin- 
ing the reliability of your designs in the 
field. Because most semiconductor 
devices are tested for hot temperature 
operating life (HTOL), it is easy to com- 


pare how long a product would last under 


high temperature conditions. 





Table 1 - CPLD thermal emissions (degrees Centigrade)


Consider how well a CoolRunner XPLA3 
256-macrocell device performs in a Thin 
Quad Flat Package. The TQFP has the 
worst thermal characteristics of any avail- 
able CPLD package. A worst-case analysis 
shows a CoolRunner XPLA3 256-macro- 
cell device would run for 107 years at a 


constant 145 degrees Celsius! 


As shown in Table 1, all other CPLD 
products, even in their low-power modes, 
radiate more heat than CoolRunner 
CPLDs. (For more information on how 
these measurements were obtained, see 
the Xilinx Thermal Emissions Web page 
at: www.xilinx.com/products/cpldsolu- 
tions/techtopic/thermalimg.htm.) 


Conclusion 


Xilinx CoolRunner XPLA3 CPLDs are 
designed for high-performance, low- 
power products such as portable PCs, 
PDAs, and handheld wireless devices 
where heat can be a critical factor in form, 
fit, and function. If heat is the problem, 
CoolRunner CPLDs are the solution. 


Table 1 legend (devices compared): Ambient; Xilinx XCR3256XL-7TC44; Cypress CY37256VP160-100AC; Lattice M4LV-128/64-10YC; Altera EPM7256AETC144-7; Altera EPM3256ATC144-7; Lattice ispLSI2192VE-100LT128


Reliability


FPGAs - The Solution to Ultra-Deep Sub-Micron Design


With Xilinx FPGAs you can focus on your design function without being concerned with the tricky physical design issues caused by today’s ultra-deep sub-micron device geometries.





by Austin Lesea 
Principal Engineer, Xilinx 
austin@xilinx.com 


Sudip Nag 
Manager Implementation Tools Engineering, Xilinx 
sudip@xilinx.com 


Hitesh Patel, 
Manager Alliance EDA Marketing, Xilinx 
hitesh@xilinx.com 


As device geometries decrease from 0.25µ to 0.13µ, the problems of substrate coupling, ground bounce, and interconnect crosstalk increase dramatically. ASIC designers who want to take advantage of these new device technologies are faced with a difficult task; they must create designs that are both logically correct and reliable within the specified environmental extremes. As device geometries decrease, it becomes much more difficult to produce reliable designs.


The FPGA Solution 
The FPGA solution (consisting of both 


devices and software) assures you of a reli- 
able design, because FPGAs are composed 
of a consistent architecture that has been 
tested over a long period of time, under 
real world conditions. Thus, FPGAs are 
guaranteed to operate exactly as specified, 
with very predictable interconnect delays, 
so you can spend more time on design 


optimization. 
Substrate Coupling 


Substrate coupling is a serious problem for 
either an ASIC or an FPGA. To guarantee 
performance, both ASIC and FPGA IC 
designers must use the best tools and 
models possible to accurately simulate the 
device, and have the means to automati- 
cally verify the design once it is ready for 
mask making. 


Xilinx has developed proprietary automat- 
ed techniques similar to those used by 
EDA vendor CadMOS (used in their 
SeismIC™ and PacifIC™ products for tra- 
ditional DSM ASIC designs). We use 
these tools to automatically model and 
verify all of our FPGAs, which isolates you 


from this issue. 







Substrate Bounce 


Substrate bounce is caused by the switch- 
ing of fast, high-current I/O transistors. 
It can cause double-clocking, as well as 
indeterminate and invalid logic states. 
The substrate bounce effects on Xilinx 
FPGAs are modeled precisely, and our 
designs are guaranteed to provide ade- 
quate isolation. For all of our FPGAs, 
Xilinx models and minimizes substrate 
noise in the design prior to making the 


device masks for fabrication. 
Interconnect Crosstalk 


Interconnect crosstalk between the tiny wires, on multiple metal layers, now requires a 3D field solver for extraction; anything less is not accurate enough to completely model the interconnections on the chip.


Models must take into account the potential for crosstalk-induced delay so that any possible user circuit will behave predictably and reliably, regardless of process and silicon variation. Xilinx IC designers perform extensive interconnect modeling using field modeling to ensure that our FPGAs do not have crosstalk problems.
Device Fabrication 


IC Designers must also model devices 
carefully to avoid yield problems, speed 
grading issues, and design failures. Xilinx 
manufactures millions of devices for each 
FPGA family, and the manufacturing 
process utilizes device monitors, test 
structures, and other process related struc- 
tures that are measured on every wafer. 
The models are incrementally refined so 
that all process corners are modeled. The 
use of highly accurate device models 
enables Xilinx IC designers to rigorously 


verify and characterize all of our devices. 


The combination of extensive and accu- 
rate modeling, simulations, and verifica- 
tion requires hundreds of engineer-years. 
Xilinx has made the investment and pre- 


engineered all of our devices so you don't 


have to worry about whether the silicon 
will work. ASIC IC designers must also 
expend the same effort, but often the 
expense is too high and resources are 
inadequate, resulting in designs that are 
not reliable. 


The Software Solution 


Using the current ultra-deep sub-micron 


design rules, interconnect delays account 


for approximately 75% of the path 









delays. Therefore, design modeling and 


timing closure become significant factors, 


and development tools are critical. 


For FPGAs, interconnect delays have 
always been a significant portion of path 
delays because of the existence of switches 
in the routing paths. However, FPGAs 
have specific, fully characterized routing 
resources and, therefore, accurate delay 
models can be achieved using exhaustive 


empirical methods and functional analysis. 


FPGA software tools are extremely mature in the area of handling interconnect delays. Therefore, the ultra-DSM technology does not pose a new problem for FPGA place and route tools. Specifically, the delay estimation methods are mature and sophisticated, resulting in accurate delay prediction. The placement and routing tools are smart enough to determine when to update the timing/slack information dynamically, based on changing interconnect delays during layout.


The newer Virtex and Spartan-II generation FPGAs are co-developed by the Xilinx software and hardware teams. This
process naturally results in a highly pre- 
dictable architecture, and predictable 
interconnect delays. This allows the soft- 
ware to make correct placement and 
routing decisions early in the design 
process, even during the synthesis 
phase where there is maximum 
flexibility to influence the 


design performance. 


The quality of the synthesis wireload models plays a key role in the timing predictability after synthesis. Advancements in FPGA synthesis technology (such as improved wire delay estimation by synthesis-driven placement tools) enable highly accurate timing predictability, which is on average 20% to 25% more accurate than ASIC technology. In addition, re-synthesis capabilities for critical path optimization reduce the number of design iterations for faster time to timing closure.
Conclusion 


With FPGAs, you can now implement 
multi-million gate designs without being 
plagued by the DSM problems inherent in 
ASICs. Our advanced FPGA architectures 
shelter you from physical design issues 
such as crosstalk and ground bounce, and 
the latest synthesis and implementation 
software delivers timing predictability 
early in the design flow, giving you a sig- 
nificant reduction in design closure time, 


and a shorter time to market. 
Use the Virtex or Spartan True Dual-Port™ RAM and DLLs to create a real-time histogram.


by Edgard Garcia 
Engineer, Multi Video Designs 
edgard.garcia@mvd-fpga.com 


Image processing is key for many automated 
industrial inspection applications. However, 
even the most sophisticated algorithms can't 
extract the right information if the image 
contents are not available in a convenient 
format. By using a histogram, you can ensure 
that the image content can be easily 


processed. 
What is a Histogram? 


For each possible pixel value, the histogram algorithm counts the number of times the value was encountered in the current image. For example, the histogram of an 8-bit-per-pixel image will contain 256 values (2^8), each one representing the number of pixels found at this value. This allows a microprocessor or DSP to quickly get the profile of the image, and make the appropriate decisions, by analyzing just those 256 precomputed values. You can do this easily, in real time and at low cost, in a Virtex or Spartan-II FPGA.
A Basic Hardware Implementation 


For an 8-bit-per-pixel image, 256 different 
values are possible for each pixel, so 256 16- 
bit counters would be necessary to complete 


the real time histogram. However, only one 
of the 256 counters will be active at each valid pixel clock (only one value will be updated). Therefore, the registers of the 256 x 16 bit counters can be replaced by a memory array, such as a 4K-bit block SelectRAM™ organized as 256 x 16 (RAMB4_S16).





Figure 1 - Basic hardware implementation 





A 16-bit incrementer will allow you to 
update the RAM contents during a Read- 
Modify-Write operation, where the video 
data inputs are used as the address of the 
memory block. Figure 1 shows the block dia- 


gram of a basic hardware implementation. 




Optimized Implementation 


Each memory cycle can be either a Read 
or a Write, so we need to divide each pixel 
clock cycle in two sub-cycles: a Read cycle 
for getting the current value, and a Write 
cycle for updating (+1) the memory con- 
tent. You can do this easily, using a 
CLKDLL to recover a clock at twice the 
frequency of the video clock (CLK2X), 
and to create an image of this clock shift- 
ed by 90° to validate Read and Write 
cycles. Figure 2 shows the detailed dia- 


gram of an optimized implementation. 


During horizontal and vertical retrace, 
pixel values must be discarded. This is 
done with no additional logic, by con- 
necting the BLANKING# signal to the 
ENA input of the memory block. Figure 


3 illustrates the timing of the operations. 
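

The read-modify-write scheme can be modeled in software to check the expected RAM contents. This is a behavioral sketch only: the function name is hypothetical, and the 16-bit wraparound on overflow is an assumption about how a plain incrementer would behave, not a statement from the article.

```python
DEPTH, WIDTH = 256, 16          # RAM organized as 256 x 16, as described above


def histogram_rmw(pixels, blanking_n):
    """Model one read and one write per pixel clock; blanking_n is active low."""
    ram = [0] * DEPTH
    for value, enable in zip(pixels, blanking_n):
        if not enable:          # BLANKING# low: RAM enable deasserted, pixel discarded
            continue
        count = ram[value]                               # first sub-cycle: read
        ram[value] = (count + 1) & ((1 << WIDTH) - 1)    # second sub-cycle: write +1
    return ram


pixels = [5, 5, 7, 5]
blanking_n = [1, 1, 1, 0]       # the last pixel arrives during retrace
ram = histogram_rmw(pixels, blanking_n)
print(ram[5], ram[7])
```

The pixel value is used directly as the RAM address, so no comparison or decoding logic is needed to select the counter to update.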


The DSP or microprocessor can directly read the result of the histogram by using the B port of the same block SelectRAM (configured as a RAMB4_S16_S16). A multiplexer is not needed because the two ports (A and B) each have dedicated inputs.


Figure 3 - Timing (signals shown: CLK2X, CLK90, VIDEO_IN)
Resources and Performance


Here are the logic resources required for implementing the histogram algorithm:

• 1x CLKDLL + BUFG

• 1x RAMB4_S16_S16

• 1x 16-bit INCREMENTER (8 slices)


For a Virtex -6 or Spartan-II -6 device, Fpix = 50 MHz.
Figure 2 - An optimized implementation


Conclusion 


By taking advantage of the high-level features of the Virtex and Spartan-II FPGA architectures, you can greatly increase the speed and reduce the cost of your designs. For more information about how to implement the histogram algorithm, e-mail:


edgard.garcia@mvd-fpga.com
High Performance Digital Down-Converters for FPGAs


by Ray Andraka 
President, Andraka Consulting Group, Inc 
ray@andraka.com 


Digital down-converters (DDC) are a key 
component for digital radio. The DDC per- 
forms the critical frequency translation need- 
ed to recover the information from a digi- 
tized modulated signal. 


Thanks to the high level of interest in digital radio, the market for DDC devices is soaring. Typically, a designer will select an off-the-shelf application-specific standard part (ASSP) for this task. Although the costs of these parts have fallen precipitously in the face of market demand, ASSPs don’t offer the design flexibility or integration attainable in an FPGA.


ASSP vendors are stuck with the challenge of creating a one-size-fits-all design, and end users are stuck with fitting the device to their needs, often paying for features or performance they don't need or want. DDCs implemented in FPGAs, however, can compete with ASSPs by offering the additional benefits of customizability and higher integration.


A down-converter consists of a numerically controlled digital oscillator, a mixer (shown as a pair of multipliers), and a low pass filter, as shown in Figure 1. The band-limited output from the filter allows us to reduce the sample rate by decimating. The design is fairly straightforward, although we must pay attention to the fidelity of the digital sinusoid (the sine and cosine waveforms) produced by the numerically controlled oscillator. We must also consider the quality of the filters, if we are to have acceptable noise performance. (We must keep the design from adding so much noise to the incoming modulated signal that we can't reliably detect it. How much noise is acceptable depends on the application.)


Some digital radio applications have fairly 
high sample rates, which can make the 
design more challenging. With careful 
design, however, modern FPGAs can han- 
dle data as fast as any commercially avail- 
able analog-to-digital converter can supply 
it. The advantage of using an FPGA is that 


it allows us to customize the DDC to exact- 


ly match our application. Furthermore, 
with an FPGA implementation, we can put 
the DDC and any post-processing in the 
same chip. Post-processing is usually some 


form of demodulator. 
The Oscillator 


In terms of system performance, the criti- 
cal component in digital down-conversion 
is the numerically controlled oscillator 
(NCO). This component generates a sam- 
pled digital sinusoid, which when mixed 
with the incoming signal, shifts the sig- 
nal’s spectrum. In other words, if we mul- 
tiply (mix) a signal with a sine wave, we 
get a frequency translation or “shift” of 
the spectral image. The amount of translation is equal to the frequency of the “carrier” sine wave.


Insufficient precision or accuracy in the 
sinusoid leads to degraded signal-to-noise 
ratios and to spurious spectral artifacts, 
either of which can swamp the incoming 
signal. Attention to the quantization that 
leads to these noise terms is essential for 
the proper design of an NCO. In our 
implementation model, our NCO con- 
sists of a phase accumulator frequency 
synthesizer and a phase angle-to-wave 
shape conversion. The phase angle-to- 
wave shape conversion circuit may be any 


one of several possible designs. 





Figure 1- Digital down-converter 
Frequency Synthesizer


The frequency synthesizer is simply an accumulator used to integrate a phase increment value. If we interpret the MSB (most significant bit) of the accumulator as having a weight of π, then the accumulator represents the fractional portion of the accumulated phase angle. Phase accumulator frequency synthesis is discussed in detail in Xcell Journal #31 in an article by Austin Lesea (www.xilinx.com/xcell/xl31/xl31_32.pdf).


Using a phase accumulator offers several 


advantages over other methods: 


• The synthesized frequency need not have an integer relationship to the sample clock, because modulo arithmetic preserves the fractional part of the accumulated phase on an overflow. This lets us set the local oscillator to an arbitrary frequency without changing the sample rate.


• The phase increment value does not have to be a constant. By dynamically changing the increment value, we can easily modulate the phase or frequency of the generated signal.


• Because 2^n represents a full phase revolution, this generator interfaces nicely with look-up tables for wave shape conversion.


Nothing in the phase accumulator design will impair the noise performance of the NCO; reducing word width only restricts the frequencies that can be synthesized.
Noise is generated by an imperfect rendition of the sinusoid at the output of the
NCO. That noise can be phase errors 
(angular distortions) or amplitude errors. 
The phase accumulator generates only a 
phase angle, so there is no amplitude 
error. Errors caused by quantization of the 
phase increment can cause a frequency 


error, but not a changing phase error. 
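

A phase accumulator is small enough to model in a few lines. The sketch below assumes an n-bit accumulator whose full range represents one revolution, so the synthesized frequency is increment × sample rate / 2^n; the class name, the helper, and the 10.7 MHz target are illustrative choices, not values from the article.

```python
class PhaseAccumulator:
    """n-bit accumulator; overflow wraps modulo 2**width (one full revolution)."""

    def __init__(self, width, increment):
        self.width = width
        self.mask = (1 << width) - 1
        self.increment = increment & self.mask
        self.phase = 0

    def step(self):
        """Return the current phase word, then advance by one sample clock."""
        current = self.phase
        self.phase = (self.phase + self.increment) & self.mask
        return current

    def output_frequency(self, sample_rate_hz):
        return self.increment * sample_rate_hz / (1 << self.width)


def increment_for(freq_hz, sample_rate_hz, width):
    """Nearest phase increment for a requested frequency."""
    return round(freq_hz * (1 << width) / sample_rate_hz)


FS = 64e6                                   # sample rate used later in the article
inc = increment_for(10.7e6, FS, 32)         # hypothetical target frequency
nco = PhaseAccumulator(32, inc)
print(inc, nco.output_frequency(FS))
```

Quantizing the increment produces a small, fixed frequency offset (at most half a step of sample rate / 2^n), which is the frequency error mentioned above; it does not accumulate into a growing phase error.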
Waveform Synthesis 


The phase accumulator produces a 
“wrapped” phase angle that must be con- 
verted to a sampled complex sinusoid. 
The accuracy of the conversion directly 
affects the noise performance of the 
DDC. The noise introduced by the NCO 
is caused by amplitude and phase errors, 
which manifest themselves as reduced sig- 
nal-to-noise-ratio (SNR) and degraded 
spurious free dynamic range (SFDR) 
respectively. Each additional bit of phase 
improves the SFDR by about 6dB and 
extra amplitude resolution adds to the 


SNR by about 6dB. 


The most obvious conversion circuit is a simple lookup table of sine values by phase angle, which is addressed directly by the phase accumulator. The phase resolution determines the depth of the table, while the amplitude precision determines the width. To keep the size of the table reasonable without sacrificing frequency resolution, we must truncate the phase accumulator output, using only the MSBs at the cost of degrading the SFDR. The size of a table grows exponentially with phase resolution, so for even moderate SFDR requirements, the table becomes larger than what we would like to use in an FPGA.


Simple amplitude and phase symmetry allows us to reduce the table size by a factor of 4 by reusing the first quadrant data for the other quadrants. The same table is used for both the sine and cosine values, so if clock cycles per sample permit, the same ROM can be read twice per sample. In Virtex devices, you can use the dual-port feature of the block RAM to simultaneously obtain both the sine and cosine values from a shared ROM. Large ROMs in FPGAs are expensive in terms of resources used so, for phase resolutions of more than 8 to 10 bits, other methods should be used.
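

The quadrant symmetry can be sketched as follows. The table widths (10 phase bits, 12 amplitude bits) and the half-sample phase offset are assumptions made for this example; the offset is one common way to make the mirrored read line up without an extra table entry.

```python
import math

PHASE_BITS = 10                        # assumed phase resolution
AMP_BITS = 12                          # assumed amplitude width
QUARTER = 1 << (PHASE_BITS - 2)        # only one quadrant is stored
SCALE = (1 << (AMP_BITS - 1)) - 1

TABLE = [round(SCALE * math.sin(2 * math.pi * (i + 0.5) / (1 << PHASE_BITS)))
         for i in range(QUARTER)]


def sine(phase):
    """phase is a PHASE_BITS-wide integer; its top two bits select the quadrant."""
    phase &= (1 << PHASE_BITS) - 1
    quadrant, index = divmod(phase, QUARTER)
    if quadrant & 1:                   # quadrants 1 and 3: read the table backwards
        index = QUARTER - 1 - index
    value = TABLE[index]
    return -value if quadrant & 2 else value   # quadrants 2 and 3: negate


def cosine(phase):
    return sine(phase + QUARTER)       # cosine is sine advanced by a quarter turn


print(len(TABLE), sine(0), cosine(0))
```

Here 256 stored entries serve all 1024 phase values, and the sine and cosine lookups differ only in their address, which is what allows a dual-port memory to return both in the same cycle.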


Figure 2 - FPGA implementation of a digital down converter


The large ROMs can be avoided by algorithmically generating the sine and cosine on the fly. While that sounds difficult, there is a simple shift-add algorithm based on vector rotation called CORDIC (COordinate Rotation DIgital Computer) that makes this task fairly easy in hardware. (See www.andraka.com/cordic.htm for details on CORDIC.) The algorithm simultaneously generates a sine and cosine value by rotating a unit vector from the “I” axis to the desired phase angle using a series of successively smaller elemental rotations. The angles of those elemental rotations are specifically selected for a shift-and-add implementation. The “I” (real or in-phase) and “Q” (imaginary or quadrature) components of the rotated vector are proportional to the cosine and sine of the phase angle respectively.
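

A floating-point model shows the structure of the CORDIC iteration. It is a sketch under stated assumptions: 16 iterations, angles limited to ±π/2, and multiplications by 2^-i standing in for the hardware shifts. The function names are illustrative.

```python
import math

ITERATIONS = 16                                              # assumed iteration count
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]   # elemental rotation angles
GAIN = math.prod(math.sqrt(1 + 4.0 ** -i) for i in range(ITERATIONS))


def cordic_rotate(x, y, angle):
    """Rotate (x, y) by angle in radians (|angle| <= pi/2); result is scaled by GAIN."""
    z = angle
    for i in range(ITERATIONS):
        d = 1 if z >= 0 else -1          # rotate toward zero residual angle
        # In hardware the factor 2**-i is a right shift by i bits.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x, y


def cordic_sin_cos(angle):
    """Start on the I axis, pre-scaled so the result has unit length."""
    x, y = cordic_rotate(1.0 / GAIN, 0.0, angle)
    return y, x


s, c = cordic_sin_cos(0.3)
print(s, c)
```

Each iteration uses only additions, subtractions, and shifts, and the fixed gain of the rotator is removed here by pre-scaling the starting vector.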
The Mixer 


The function of the mixer is to multiply 
the incoming signal by the locally generat- 
ed sinusoid to shift the spectrum of the 
signal. A straightforward implementation 
uses two multipliers, one each for the sine 
and the cosine. The multipliers produced 
by the CORE Generator tool can easily be 
used for this application. 


If we use CORDIC for the wave shape conversion, however, we can obtain the mixer function for free. The combination of the NCO and the mixer multiplies the incoming signal by cos(ωt) - j sin(ωt) = e^(-jωt). Because the NCO and mixer generate a complex phasor, the net effect is to rotate the incoming signal by a constantly changing phase angle. Rather than rotating a unit vector to get I and Q scale values, we can use the CORDIC to directly rotate the input signal. This eliminates the two multipliers and avoids the potential for additional quantization noise.


A more subtle advantage to using 
CORDIC is that it actually rotates the 
vector rather than multiplying the compo- 
nents separately. This means it does not 
add noise to the signal other than the 
spectral spurs caused by the phase quanti- 
zation. The CORDIC hardware occupies 
about the same area as a pair of multipli- 
ers with the same input width in the 
Virtex architecture. Thus, in effect, we 
have a net area savings about equal to 
what we would have used for the sine and 
cosine wave shape conversion. The 
CORDIC rotator also accepts a complex 
input, so no additional hardware is need- 
ed for applications requiring a complex 


signal input. 
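

The frequency shift produced by this rotation is easy to verify numerically. The sketch below uses Python's complex arithmetic in place of a CORDIC rotator; the 16 MHz carrier and the function name are assumptions, and a complex input tone is used so that no image appears at twice the carrier frequency.

```python
import cmath
import math


def mix_down(samples, nco_freq, sample_rate):
    """Rotate each sample by the NCO phase, i.e. multiply by exp(-j*2*pi*f*n/fs)."""
    out = []
    for n, s in enumerate(samples):
        phase = 2 * math.pi * nco_freq * n / sample_rate
        out.append(s * cmath.exp(-1j * phase))
    return out


FS = 64e6                 # sample rate quoted in the article's conclusion
F_CARRIER = 16e6          # hypothetical carrier frequency
tone = [cmath.exp(2j * math.pi * F_CARRIER * n / FS) for n in range(64)]
baseband = mix_down(tone, F_CARRIER, FS)
print(baseband[0], baseband[-1])
```

A tone at the NCO frequency comes out as a constant, which is the spectrum shifted down to DC; a tone offset from the NCO frequency would come out rotating at the offset frequency.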
The Filter and Decimator 


The mixed signal has to be filtered to iso- 
late the portion of the spectrum containing 
the signal of interest. The filter typically 
has to be a narrow-band filter with a fairly 
high rejection of unwanted spectrum. This 
translates to an expensive filter if it is done 
at the input sample rate. Instead, we can 
use a multi-rate approach in which the sig- 
nal is first decimated to a much lower sam- 
ple rate using a less computationally inten- 
sive filter. Then the signal is cleaned up 
with a second more complex filter working 


at the decimated sample rate. 


High Ratio Decimator 


A high-ratio decimation can be performed 
very efficiently using a cascaded integrator- 
comb (CIC) filter. The CIC filter is a 
recursive implementation of the “boxcar” 
or moving average filter. The spectral 
response of such a filter is the sinc (sinx/x) 
function. In a CIC filter, the number of 
effective taps is an integer multiple of the 
decimation ratio, so the filter nulls alias 
onto the passband when the spectrum is 


folded by decimation. If the passband is 


sufficiently narrow, the rejection of the 
aliased image is quite good, much better 
than might be expected otherwise. We can 
also cascade several sections to lower the 
amplitude of the side lobes. The passband 
of this filter does exhibit a pronounced roll- 
off that usually must be corrected by the 
clean-up filter. Keeping the passband of the 
final filter narrow not only improves the 
alias rejection, but also makes the roll-off 


compensation easier. 


The advantages of using a CIC filter in this implementation are:


• It is a computationally easy filter to realize.


• The same filter structure works for a very wide range of decimation ratios by simply changing the timing of the clock enables on the comb section.


• The filter response referred to the output sample rate is nearly independent of the decimation ratio, so one clean-up filter can be used for all decimation ratios.


The gain of the CIC filter is a function of 
the decimation ratio. Therefore, a barrel 
shifter is required after the CIC filter in 
applications where the decimation ratio has 
to be changeable without changing the cir- 
cuit. This is an issue in an ASSP DDC, as 
it is a one-size-fits-all solution. Most of the 
time in FPGAs, we can hardwire the shift, 
or at worst, use a limited barrel shift, 
because we can customize the DDC for our 


application. 
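

The structure and gain of a CIC decimator can be illustrated with an integer model. This is a sketch, not the author's design: the function names are hypothetical, the comb delay of one output sample is an assumption, and Python's unbounded integers stand in for the wide wraparound registers used in hardware.

```python
def cic_decimate(samples, ratio, order=4, delay=1):
    """`order` integrators at the input rate, decimate by `ratio`, `order` combs."""
    integ = [0] * order
    comb_mem = [[0] * delay for _ in range(order)]
    out = []
    for n, x in enumerate(samples):
        acc = x
        for k in range(order):             # integrator cascade, every input sample
            integ[k] += acc
            acc = integ[k]
        if (n + 1) % ratio == 0:           # comb cascade runs only on kept samples
            y = acc
            for k in range(order):
                delayed = comb_mem[k].pop(0)
                comb_mem[k].append(y)
                y -= delayed
            out.append(y)
    return out


def cic_gain(ratio, order=4, delay=1):
    """DC gain of the filter, which grows with the decimation ratio."""
    return (ratio * delay) ** order


out = cic_decimate([1] * 64, 8)
shift = cic_gain(8).bit_length() - 1       # right shift that normalizes a power-of-two gain
print(out, shift)
```

For a decimation ratio of 8 and four stages the gain is 8^4 = 4096, so a fixed 12-bit shift restores unity gain; when the ratio is fixed by the application, that shift can be hardwired as the text describes.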
“Clean-Up” Filter 
The output of the CIC filter has a sinc 


shape, which is not suitable for most appli- 
cations. A “clean-up” filter can be applied 
at the CIC output to correct for the pass- 
band droop, as well as to achieve the 
desired cut-off frequency and filter shape. 
This filter typically decimates by a factor of 
2 or 4 to minimize the output sample rate 
after the passband has been limited and 
shaped. An application-specific filter 
response, such as a raised cosine Nyquist 
filter, can either be combined into the cor- 
rection filter or be applied at a subsequent 
filter stage. The clean-up filter is compactly implemented using serial distributed arithmetic (see www.andraka.com/distribu.htm for a tutorial on distributed arithmetic).


Identical filters must be applied to both 
the I and Q channels. Even using the slow- 
est speed grade Virtex FPGAs, the DDC 
design described here can be clocked at 
more than 130 MHz if the design is care- 
fully executed and floor planned. This 
high potential clock rate permits us to 
time multiplex the I and Q data through 
the same filters by interleaving the I and Q 
samples on a clock-to-clock basis. Thus for 
very little additional overhead, we can 
handle both the I and Q data in the same 
filter. We can also use the same technique 
to handle several independently tuned 
channels with a single instance of the 


DDC design. 


An advantage of using an FPGA for the DDC is that we can customize the filter chain to exactly meet our requirements. With an off-the-shelf chip, we would have to either fit our requirements to the chip's features or add post-processing to modify the output to our needs.


Conclusion 


We've briefly discussed implementation of 
a high performance DDC in an FPGA. If 
we apply these techniques to a 16-bit 
DDC with a 64 MS/sec input and a 100 dB 
SFDR requirement, we come up with a 
design that occupies about 550 Virtex 
CLBs (configurable logic blocks). The 
occupied area is heavily influenced by spe- 
cific requirements of the application. The 
cited design, shown in Figure 2, consists of 
an NCO and mixer implemented as a 
CORDIC rotator and a programmable 
decimating filter. The filter is a 4th order 
CIC filter followed by a 63-tap symmetric 
Finite Impulse Response (FIR) filter. 
Backing off on any of the requirements 
can substantially reduce the area occupied 
by the DDC. Because we are using an 
FPGA, we have the luxury of picking the 
features and performance to match our 
application. If we were to use an ASSP
component, we would have to mold our
requirements and design around the
capabilities of the selected device.

by David Mann
Multimedia ASVC Marketing, Integrated Silicon Systems
info@iss-dsp.com


The real-time manipulation of high-resolu- 
tion images (either moving-picture video 
or still-frame image streams) usually 
demands custom digital video processing 
in hardware. But why use a bunch of different
ASSPs for common video/image
processing tasks, such as Color Space
Conversion (RGB to YCrCb or vice versa)
or Discrete Cosine Transform, when you
can do it all in a Virtex, Virtex-E, or
Spartan device? And the performance of
these FPGA-optimized "Application-Specific
Virtual Components" is very attractive.


There are several standard video processing 
functions that are common to many vision 
systems; these systems include video broad- 
cast, machine vision, and image filtering 
applications. Now, thanks to Integrated 
Silicon Systems’ ASVC technology, the IP 
cores powering these applications can be 
implemented in FPGAs, and Xilinx can 
supply all of the vital links in your customized
video compression/decompression
chain (as shown in Figure 1).

Color Space Conversion


Color Space Conversion (CSC) is one of
the standard image processing techniques;
it's a trick that allows you to use the digital
image data associated with a color pixel
more efficiently by switching color
domains. Processing an image in the
Red-Green-Blue color space, with a set of
(R, G, B) values for each and every pixel,
really isn't very efficient. The RGB
representation has a significant downside:
although it's the natural paradigm for
rendering full-color pictures using display
technologies that emit mixtures of the three
primary colors (such as CRTs, LCDs, and
LEDs), it is not as efficient as the
alternative schemes.


The standard alternative representations 
use de-correlated components—luminance 
and chrominance. Thus CSC only comes 
into play whenever it’s time to present a 
picture to the human visual cortex, or after 
a real-world image is captured using a 
scanner or camera followed by processing 


in the digital domain. 
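
As a rough illustration of switching between color domains, the Python sketch below converts a pixel from RGB to a luminance/chrominance triple and back. The article does not give the coefficients or value ranges used by the cores, so the ITU-R BT.601 luma weights, the normalized 0.0 to 1.0 component range, and the function names `rgb_to_ycrcb` and `ycrcb_to_rgb` are all assumptions made for this example.

```python
# Luma weights from ITU-R BT.601 (an assumption; not stated in the article).
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_ycrcb(r, g, b):
    """Convert normalized RGB (0.0 to 1.0) to (Y, Cr, Cb).

    Y carries brightness; Cr and Cb are scaled color differences
    in the range -0.5 to +0.5.
    """
    y = KR * r + KG * g + KB * b
    cr = (r - y) / (2 * (1 - KR))
    cb = (b - y) / (2 * (1 - KB))
    return y, cr, cb


def ycrcb_to_rgb(y, cr, cb):
    """Inverse conversion back to normalized RGB."""
    r = y + 2 * (1 - KR) * cr
    b = y + 2 * (1 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


if __name__ == "__main__":
    pixel = (0.2, 0.6, 0.9)
    print(rgb_to_ycrcb(*pixel))
    print(ycrcb_to_rgb(*rgb_to_ycrcb(*pixel)))
```

A gray pixel (equal R, G, and B) maps to zero chrominance, which is one way to see that the luminance and chrominance components are far less correlated than R, G, and B.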
New Color Space Converter LogiCOREs
and a Combined Forward/Inverse DCT
LogiCORE give you pre-designed blocks
that solve difficult design problems.


Color Space Converter LogiCOREs 


Xilinx presently offers a family of four different
Color Space Converter LogiCOREs,
as shown in Table 1.

Other useful pre-processing functions such
as Gamma Correction are also incorporated
in these particular ISS-designed
LogiCOREs, so you spend less time developing
your CSC design using these solutions.


Here are just a few application areas for the
Color Space Conversion family of
LogiCOREs:

• Video output conversion to digital RGB.
• Image filtering.
• Machine vision.
• Video and still-image processing.

DCT Engine LogiCORE

Figure 1 shows where one of the
RGB2YUV or RGB2YCrCb Color Space
Converter LogiCOREs fits into a typical
video/image processing flow. This example
data path incorporates a Discrete Cosine
Transform block, a necessary element in
image compression algorithms.

Figure 1 - LogiCOREs in a typical digital video/image processing chain


Now, there's a new LogiCORE that combines
both Forward DCT and Inverse DCT
functions in one, and it's ISO/IEC 10918-1
JPEG compliant. This high-performance
DCT/iDCT engine offers 1-symbol/cycle
processing power thanks to its fully pipelined
architecture. The design is highly tuned for
optimal performance across the various
Xilinx FPGA technologies. It requires only
1756 slices in Virtex, 1759 in Virtex-E, or
1728 in Spartan devices, and only 48 IOBs
are needed for interfacing.


This design is very efficient in the Xilinx
architecture because the 2-D transform
uses row-column decomposition to separate
it into two distinct 1-D operations.
Each operation generates a set of intermediate
results that are written into transpose
memory. Data is "burst" into the
DCT/iDCT core as blocks of 64 values, and
the results of the transform are presented in
the same format.

When in Forward DCT mode, this
LogiCORE takes 8-bit input data words and
produces an 11-bit output. In the Inverse
mode, the converse is true. You've got 14-bit
cosine coefficients, and a 15-bit representation
in transpose memory, so there's no need
to worry about precision.

LogiCORE                    RGB to YUV    YUV to RGB    RGB to YCrCb    YCrCb to RGB
System clock (Virtex)       >75 MHz       >75 MHz       >60 MHz         >75 MHz
System clock (Spartan-II)   >80 MHz       >65 MHz       >65 MHz         >70 MHz
System clock (Virtex-E)     >100 MHz      >100 MHz      >90 MHz         >90 MHz
                            Carry logic   Carry logic   Carry logic     Carry logic
Availability

Table 1 - LogiCORE specifications
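
The row-column decomposition described for the DCT/iDCT core can be modeled in software. The Python sketch below is a floating-point illustration only: it assumes an 8x8 block (64 values) and an orthonormal DCT-II, and it does not reproduce the core's fixed-point word widths or pipelining. The names `dct_1d`, `transpose`, and `dct_2d` are hypothetical.

```python
import math

N = 8  # 8 x 8 block, i.e. 64 values per burst


def dct_1d(v):
    """Orthonormal 1-D DCT-II of an N-point sequence."""
    out = []
    for u in range(N):
        scale = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
        acc = sum(
            v[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
            for x in range(N)
        )
        out.append(scale * acc)
    return out


def transpose(block):
    """Swap rows and columns, the job done by transpose memory."""
    return [list(col) for col in zip(*block)]


def dct_2d(block):
    """2-D DCT as two 1-D passes with a transpose in between."""
    rows_done = [dct_1d(row) for row in block]        # first 1-D pass, along rows
    intermediate = transpose(rows_done)               # intermediate results, transposed
    cols_done = [dct_1d(row) for row in intermediate] # second 1-D pass, along columns
    return transpose(cols_done)


if __name__ == "__main__":
    flat = [[10] * N for _ in range(N)]
    print(round(dct_2d(flat)[0][0], 6))  # energy of a flat block lands in the DC term
```

Two 1-D passes need on the order of 2 x 8 x 64 multiply-accumulates per block instead of 64 x 64 for the direct 2-D sum, which is why the separable form maps well onto hardware.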
Using the Combined Forward/Inverse DCT
LogiCORE makes it very easy to create
your own design, even if you don't have the
engineering bandwidth or DCT expertise.
And the Xilinx software tools make it easy.

Designing with LogiCOREs


If you're familiar with HDL-based design
and simulation, component instantiation,
script-based logic synthesis, and the use of
testbenches, then you're all set to design
using LogiCOREs. All the LogiCORE
modules described here are available under
a standard license agreement from Xilinx.
You get the code and test vectors, together
with installation and instantiation instructions,
as part of the LogiCORE deliverables.

Conclusion


Digital video/image processing applica- 
tions can be very difficult to develop. 
However, the new Xilinx LogiCOREs offer 
feature-enhanced Color Space Conversion 
and Forward/Inverse Discrete Cosine 
Transforms that give you a time-to-market 


advantage. 


There are more LogiCOREs in develop- 
ment for digital video applications, includ- 
ing standalone M-JPEG Codec solutions 
for Virtex and Virtex-E. Talk to an ISS rep- 
resentative or your local Xilinx FAE about 


your particular application. 


To find out more, or to access the
datasheets, visit the Xilinx dedicated IP
Center at: www.xilinx.com/ipcenter.

Development Boards


Stackable Development Boards for
Spartan-II, Virtex, and Virtex-E FPGAs

A new series of prototyping boards to help you
quickly test and implement your FPGA designs.

by Dr. Stefan Schatroth
Hardware and software development engineer,
ErSt Electronic GmbH
stefan.schatroth@erst.ch


To help you increase your productivity
and decrease your time to market, we
recently designed a new set of development
boards, using Spartan-II, Virtex,
and Virtex-E FPGAs. Like our original
Virtex-based board, the new boards provide
all the basic components
needed in most FPGA-based designs.
In addition, we incorporated an optional
large ZBT RAM to satisfy the needs of
modern telecommunication and imaging
applications. All I/Os are routed to header
connectors where you connect your
special purpose interfaces. By stacking
several boards you can easily cope with
complex designs that exceed the scope of a
single FPGA. The boards are fully compatible
with their predecessor, so
you can stack them together and reuse our
power module (PWR3) as the source of
the various required supply and reference
voltages.

Figure 1 - Functional diagram of the development board modules

Key Features 


Each development board uses either a Spartan-II,
Virtex, or Virtex-E FPGA in a PQ-208 package
(Spartan-II) or an HQ-240 package (Virtex,
Virtex-E). Vital components for a basic system
are placed around the FPGA, including two
crystal oscillators, three push buttons, eight DIP
switches, and nine status LEDs. An optional ZBT
RAM helps to support memory-demanding
applications.

All configuration modes of the FPGA are supported,
and you can provide configuration data by
using serial configuration PROMs (SCPs) sitting in
onboard sockets, by using in-system programmable
(ISP) PROMs, or by connecting a Xilinx
MultiLINX, XChecker, or JTAG cable. The ISP
PROMs and the FPGA form a single JTAG chain.
A functional diagram detailing the building
blocks of the prototyping boards is shown in
Figure 1.

The crystal oscillators are housed in standard
DIL-8 or DIL-14 size metal cans plugged into
sockets, so you can easily change the frequency.
To facilitate the distribution of very fast clocks,
we mounted four SMB coaxial connectors next to
the clock pins of the FPGA, which may be
terminated with optional resistors to ground.
The synchronous clock input of the ZBT RAM is
also connected to one of these connectors.
Alternatively, you can use an FPGA-generated
clock driven on an I/O pin.

You can configure all eight I/O banks
independently of each other, and you can
select their VCCO and reference voltages
individually with jumpers. Two different
reference voltages (derived from the
FPGA core voltage) can be generated
onboard by means of trim potentiometers.
Up to eight reference voltages can be
connected from an external source, such as
our PWR3 power module.

Push buttons, DIP switches, and LEDs
form a user interface that allows you to
provide configuration data and monitor
status information from the running system.

Figures 2 and 3 show the top and bottom
view of a development board module
equipped with an XCV1000E FPGA.

Figure 3 - Bottom view of the development board module

Applications

The board is very well suited to:

• Evaluate the larger members of the Spartan-II,
  Virtex, and Virtex-E FPGA families in the
  PQ-208 or HQ-240 packages, respectively.
• Experiment with different low voltage I/O
  standards.
• Implement custom designs using the full power
  of the Virtex architecture.
• Test algorithms under real-time conditions and
  watch the signals with a logic analyzer.
• Quickly and easily expand the complexity of the
  system by stacking several boards.

Conclusion


The EVALXC2S, EVALXCV,
and EVALXCVE development
board series gives you an
ideal platform for evaluating,
implementing, testing, and
extending custom designs
using Spartan-II, Virtex, or
Virtex-E devices. Using the
optional ZBT RAM you can
even implement applications
calling for large amounts of
memory. You can also easily
integrate the board into a
larger system. Like their predecessor,
the boards can be combined with
the PWR3 power module to form a compact
unit that runs from a single power
supply. This makes it ideal for teaching,
seminars, and courses.

For additional information on
EVALXC2S/XCV/XCVE see:
www.erst.ch, or
contact us at info@erst.ch.

Perspective


by Jannis McReynolds
Business Development Manager
Xilinx, Global Services Division
jannis.mcreynolds@xilinx.com

It's a fact of life: the market waits for no
one. Your new product idea won't be nearly
as successful if your competitors get to market
first. But success is not just about moving
quickly; design methodologies are getting
more and more complex with each new
advance in technology, so you must also
move intelligently.

Xilinx Global Services is a portfolio of services
and tools designed to keep you on the
fast track. From technical support to education
to design consulting, Xilinx Global
Services compress your learning curve and
accelerate your design time. The Xilinx
Global Services portfolio consists of
Product Services, Education Services,
Design Services, and the highly acclaimed
support.xilinx.com.

"DESIGNERS FACE A MULTITUDE OF
CHALLENGES IN THEIR EFFORTS TO STAY AHEAD
OF THE MARKET," SAID DAVE DEMARINIS,
BUSINESS DEVELOPMENT MANAGER, GLOBAL
SERVICES DIVISION. "WE DEVELOPED XILINX
GLOBAL SERVICES TO HELP THEM
WORK FASTER AND SMARTER."


It’s Okay To Cut In Line 


Our Platinum Technical Service, which
costs $1,295.00 per seat per year, is a
remote technical support service developed
to help you reduce the time you spend
troubleshooting and accelerate your design
schedule; it gives you rapid access to our
Senior Application Engineers. When you
call or log on to the website, your call
automatically moves to the head of the queue.
In addition, you receive eight Xilinx education
credits you can use to further reduce
design time.

Of course, our free services already give
you access to a wealth of technical
support, service packs, and software
updates, but our Platinum Technical
Service gives you top priority and is
available when you are: in North America,
between 7 a.m. and 5 p.m. Pacific
Standard Time (PST); in Europe, from
9 a.m. to 5:30 p.m. Greenwich Mean
Time (GMT). Xilinx Platinum Technical
Service is a powerful tool to help you get
your product to market before your
competitors.

Move to the Head of the Class 


Our Education Services are a flexible, com- 
prehensive collection of courses and venues 
targeted at keeping you and your designers’ 
technical skills sharp, so you get increased 
design productivity and therefore reduced 
time to market. 


We offer a broad range of classes at training 
centers worldwide or we'll bring the class- 
room to your place of business. We even have 
portable classrooms complete with comput- 
ers, software training materials, and instructor 
set-up, so all you have to do is show up ready 
to learn. 


If you're looking to expand your skills with
the ultimate in convenience and cost savings,
check out Xilinx e-Learning. "Live"
e-Learning is a virtual, real-time classroom
that allows you to interact with our instructors
and chat with other engineers over the
Internet. Live e-Learning is also an excellent
way to collaborate, particularly if your design
team is spread out around the world, and we
offer over 70 live e-Learning modules.


"THE XILINX DESIGN SERVICES TEAMS BRING
A LOT TO THE TABLE: UNIQUE EXPERTISE,
EXPERIENCE WITH XILINX TOOLS AND
DEVELOPMENT, AND THE FRUITS OF
INVESTMENTS XILINX CONTINUOUSLY
MAKES IN R&D," SAID DAVE DEMARINIS,
THE BUSINESS DEVELOPMENT MANAGER FOR
GLOBAL SERVICES. "WITH OUR TEAM, YOU PAY
FOR RESULTS, NOT FOR A CONSULTANT'S TIME."

Recorded e-Learning allows you to log in
from anywhere, listen to recorded education
sessions, and learn at your own
pace. The e-Learning modules can be
accessed anytime, day or night, and are less
expensive than the live e-Learning classes.

All of our Education Services provide
excellent, hands-on training, suitable for
all skill levels, from the novice to the
expert. Classes are led by instructors who
are themselves experienced designers. With
our flexible and always available
courses, you can
choose the instructional method that keeps
you on the technical cutting edge, without
sacrificing your productivity.


Stack the Deck in Your Favor 


Our Design Services give you the advan- 
tage of experienced engineers, who are ded- 
icated to your design. In the rush to mar- 
ket, it makes good business sense to take 
advantage of every resource you can, 
whether in-house or outside your company. 
Now you can outsource your design to an 
experienced Xilinx team, made up of best- 
in-class designers, silicon experts, and soft- 
ware specialists who become your virtual 


in-house design team. 


You can immediately extend your project
bandwidth and eliminate ramp-up time by
using Xilinx Design Services. The benefit is
that you're free to focus on tasks of the
highest priority to you, and you can feel
confident that your programmable logic
design is in the most capable hands.


Go Ahead... Ask Us Anything 


Our free support.xilinx.com website is
recognized as the industry leader for online
solutions to programmable logic design
issues; it has received rave reviews from
industry insiders.


At support.xilinx.com you'll find: 


• The Answers Database, with over 4,000
  proven design solutions.
• Problem Solvers, to troubleshoot device
  configuration, software installation, and
  JTAG issues.
• A Web support interface, which allows
  you to open a Platinum priority case.
• Discussion forums through which you
  can interact with other designers.

As you might expect, support.xilinx.com is
available seven days a week, 24 hours a day,
365 days a year, so you can troubleshoot
your design when it's convenient for you.
Anytime, anywhere, support.xilinx.com has
your answers.


"WHEN COMPARED AGAINST THE
COMPETITION IN CUSTOMER SURVEYS,
XILINX (SUPPORT.XILINX.COM) OUTSCORED
EVERY OTHER PLD VENDOR'S WEBSITE IN
CONTENT, ACCESSIBILITY, AND EASE OF USE,"
WROTE GARTNER GROUP IN THEIR 1999
CORPORATE CUSTOMER SATISFACTION STUDY.


Conclusion 


Xilinx Global Services can increase your 
productivity, reduce your design time, and 
improve your return on investment when 
designing programmable logic solutions. 
The Xilinx Global Services portfolio of 
education, product, and design services, 
along with the support.xilinx.com online 
resources will extend your technical capa- 


bilities and accelerate your time to market. 


For more information, contact your Xilinx
sales representative. You can find the Xilinx
sales office nearest you at:
http://www.xilinx.com/company/sales/offices.htm.

Get up to speed on the latest technologies quickly, conveniently, and cost effectively.


by Renne Ricciardi 

Business Development Manager, 
Xilinx Educational Services 
renne.ricciardi@xilinx.com 


Despite enormous technical advances since 
the days of chalkboards and spiral note- 
books, traditional instructor-led classroom 
training is still the best way to learn when 
you need an in-depth understanding of a 
specialized topic. In the day-to-day race to 
market, however, time is a luxury few 


designers can afford. 
Online Learning, On the Spot 


With Xilinx e-Learning, you can choose
from more than 70 online classes or modules
covering a broad range of topics and
skills involving Xilinx products and services.
For example:

• Introduction to FPGA Design.
• Timing Constraints.
• Spartan-II Architecture.
• ModelSim XE.
• Virtex-EM Architecture.

Each module is an hour in length, and
enrollment is quick and easy. Modules are
taught weekly and presented at different
times throughout the day to support
worldwide access. Moreover, Xilinx
e-Learning won't interfere with your project
timeline, because there's no lost
productivity due to travel time.

We can help you determine which e-learning
modules are right for you through our
brief self-assessment pre-tests. These tests
help you gauge your knowledge and determine
what you need to know so you won't
be spending money on training you've
already had.

Live e-Learning Environment


Live instructors present classes and mod- 
ules in real time. During each session, you 
will have the opportunity to interact with 
the instructor, as well as collaborate with 
online subject experts. You may pose ques- 
tions to the instructor, view slides, share 
whiteboards, and discuss issues with other 
students in chat rooms. Pop quizzes 
appear periodically throughout the ses- 
sion—and you get instant feedback. 


Participating in an e-learning session is
simple. You access the e-learning module
using your Web browser and a phone
connection. No additional software is
required. Ten minutes before the class,
you call into the conference, log on to the
URL, download training documents, and
you're ready to go.

Xilinx e-Learning classes are open to
everyone, and each e-learning module
costs $100 per session.

Customized e-Learning


If you have multiple designers who want
to take the same course at the same time,
or if you prefer a date or time that is more
convenient for you, simply call us and we
will schedule an instructor to deliver a
module of your choice for your group
only. The only requirement is that you
have a minimum of six people who want
to attend the session. The maximum
number of people is 100.


A private session gives you more control 
over the pace of delivery. Just ask the 
instructor to speed up or slow down. In 
addition, the questions and discussion can 
be focused on issues and ideas important 


to you. 


If you have designers located in remote 
locations, or if you have designers who are 
interested in different modules, call us and 
ask about a Bundle Package Program. 
With a bundle purchase, your people can 
complete modules when it’s most conven- 
ient for them. Any of the designers can 
sign in to attend a session at any time 
throughout the year, or until you have used 
up all the modules you purchased. 


Conclusion 


Xilinx e-Learning is the most cost-effective
solution to help you keep your technical skills
sharp and up-to-date. To learn more about
Xilinx e-Learning, visit the Xilinx e-Learning
website at www.support.xilinx.com/support/
education-home.htm or call the registrar at
877-999-2527.

by Autumn Conrad
Public Relations, Xilinx, Inc.
autumn.conrad@xilinx.com


As leaders in the global semiconductor 
market, we take pride in sharing our suc- 
cess with the communities in which we do 
business. Since the founding of our com- 
pany in 1984, Xilinx has maintained close 
relationships with several community part- 
ners by establishing and supporting inno- 
vative programs in the areas of education, 
health, and welfare. 


2000 Outstanding Corporate Grantmaker 


This tradition of community involvement 
was recently recognized by the National 
Society of Fundraising Executives 
(NSFRE) who honored Xilinx as the “2000 
Outstanding Corporate Grantmaker” of 
the year. With this award, Xilinx joins past 
recipients like Hewlett Packard, Aspect 
Communications, Applied Materials, Sun 
Microsystems, AMD, Therma, and Apple 
Computer as philanthropic leaders dedicat- 


ed to building strong communities. 


This prestigious award is given to corpora- 
tions that demonstrate a commitment to 
the community through financial support 
and exceptional employee involvement. 
Xilinx was nominated by five community 
organizations and educational institutions 
that have maintained a close relationship 
with us over the past 16 years. It is through 
such partnerships that we can truly make a 
difference in our communities and the lives 


of many. 


Walk for AIDS

The Santa Clara County Walk For AIDS
nominated Xilinx for our active participation
in and support of the event over the past
nine years. Once again, we hosted the
annual kick-off rally and urged other
corporations in Silicon Valley to get involved
through increased corporate sponsorship
and participation. In Santa Clara County,
the Walk For AIDS serves as the largest
fundraising event for nine organizations
that provide services and prevention
education to individuals living with HIV and
AIDS. Agencies like Health-Connections,
a community service of The Health Trust
which also nominated us for the award,
provide critical health and social services to
individuals and their families affected by
HIV/AIDS.

Children's Shelter of Santa Clara County

Xilinx employees remain active year round
by volunteering at organizations such as the
Children's Shelter of Santa Clara County. Our
relationship with the Children's Shelter started
when we helped contribute to the $13.7 million
building fund for the shelter, which is located
adjacent to Xilinx headquarters in San Jose.
In November of 1995, the Children's Shelter
opened its doors to provide shelter, 24-hour
care, and services to abused and neglected
children. Since that time, the Children's Shelter
has provided care to over 15,000 children.

Stock for Students

As part of our strong commitment to education,
we support innovative, results-oriented
programs that provide students
with the skills necessary to succeed in
tomorrow's technological world. We created
"Stock for Students," which encourages
employees to donate
personal stock to Xilinx-adopted schools.
Through funds raised by the program, 
Oster Elementary School will build the 
“Xilinx Science Laboratory” to provide stu- 
dents with hands-on learning in the life, 
earth, and physical sciences that would 


otherwise be unavailable. 
Faculty Endowment Fund 


In higher education, the Department of 
Electrical Engineering at San 
Jose State University nominated 
us for establishing a faculty 
endowment fund for the uni- 
versity. Due to the high cost of 
living in Silicon Valley, the uni- 
versity was having difficulty 
attracting and retaining faculty. 
The faculty endowment fund 
enables the university to aug- 
ment the salaries for three qual- 
ified professors annually. In 
addition, we recently funded 
the establishment of the “Xilinx 
Digital Laboratory” which will 
help prepare students to meet the chal- 
lenges they will face once they graduate. 


Conclusion 


Through our partnerships, we have come 
together to explore ways to build commu- 
nities that reflect the success and generos- 
ity of our times. We are proud of our 
partnerships with these organizations and 
take honor in supporting their efforts to 
make our community a better place for all 
people to live, work, and prosper. It is 
only by working together that we can
truly make a difference.

Software Solutions

Version 3 Development Systems
Quick Reference Guide

Xilinx development systems give you the speed you need. With the initial release of our
version 3 solutions, Xilinx place and route times are as fast as two minutes for our
200,000 gate, XC2S200 Spartan™-II device, and 30 minutes for our one million gate,
system-level XCV1000E Virtex™-E device. That makes Xilinx development systems the
fastest in the industry.


And with the push of a button, our timing-driven tools are creating designs that sup- 
port I/O speeds in excess of 800 Mbps, and internal clock frequencies in excess of 300 
MHz. With each quarterly release, we are further accelerating your design process. 


Xilinx desktop design solutions combine powerful technology with an easy to use inter- 
face to help you achieve the best possible designs within your project schedule, regard- 
less of your experience level. For more information on any Xilinx products, visit 
www.xilinx.com 


Alliance Series Solutions: 

The Alliance Series Solutions contain powerful open systems implementation
tools that are engineered to plug and play with your existing design
flow. This combination of advanced features delivers high performance
results on the toughest designs.

Foundation Series ISE Solutions:

Foundation Integrated Synthesis Environment (ISE) is the Xilinx
next-generation design environment, optimized to deliver the benefits of an
HDL methodology. Foundation ISE is packed with technologies that
help you bring your product to market faster.





Foundation Series Solutions: 

The Foundation Series solutions are complete, ready-to-use design envi- 
ronments for programmable logic design based on industry-standard 
schematic, HDL, and pushbutton design flows. 





Xilinx Web-based Design Solutions provide designers the ability to engage in digital 
design activities, on-line, using Xilinx application servers, or download design and 
implementation software modules for use in their own design environment. These 
applications include: 


WebFITTER:
The WebFITTER is a free Web-based design tool that allows system
designers to evaluate their designs using Xilinx XC9500 Series
CPLDs.

WebPACK:
The WebPACK is a collection of four free downloadable software
modules including ABEL v7.1, VHDL and Verilog synthesis,
design implementation tools, and device programming software.
WebPACK now includes support of the entire Spartan-II FPGA
family as well as the 300,000 system gate Virtex XCV300E FPGA.

Reference | Software


Base and Base Express Configurations 

The Base and Base Express configurations 
provide push button design flows and sup- 
port a broad array of FPGA and CPLD 
devices targeted for low density and high 
volume applications. 


Standard and Express Configurations 

The Standard and Express configurations 
combine push button flows with powerful 
auto-interactive tools. These tools give 
designers more influence and control over 
implementation while maintaining the ben- 
efits of design automation. 


Elite Configurations 

The Elite configurations are designed to 
support powerful design flows that deliver 
high-performance designs for even the high- 
est density, multi-million gate FPGA devices 
from Xilinx. 


WebFITTER URL:
Go to the Xilinx website
http://www.xilinx.com and jump to
"WebFITTER"

WebPACK URL:
Go to the Xilinx website
http://www.xilinx.com and jump to
"WebPACK"





Version 3 Development Systems

Feature Comparison Guide

Columns: Alliance, Foundation, Foundation ISE, WebPACK

Design Entry
  Schematic
  VHDL, Verilog HDL, ABEL HDL
  State Diagram Editor
  Floorplanner
  CORE Generator
  Timing Constraints
  Modular Design (Optional)

Design Synthesis

Design Verification
  HDL Testbench Generator
  Integrated Logic Analysis (ChipScope ILA) (Optional)
  Static Timing Analysis

Design Implementation
  Constraints Editor
  Command Line Operation
  Data Book I/O Timing
  Multi-pass Place and Route
  Project Archiving

System Interfaces
  PROM File Generator
  HDL Simulation Libraries

Environment
  Operating System: PC / UNIX, PC, PC


Device Comparison Guide 
Standard/Express Base/Base Express el we aS) 3 


All Virtex-II Family | Virtex-II Family up to XC2V1000 | Virtex-II Family up to XC2V80 | Virtex XCV300E only
All Virtex-E Family | Virtex-E Family up to XCV1000E | Virtex-E XCV50E only | All Spartan-II Family
All Virtex Family | All Virtex Family | Virtex XCV50 only | All CoolRunner Series (4)
All Spartan Series | All Spartan Series | All Spartan Series | All XC9500 Series
All XC9500 Series | All XC9500 Series | All XC9500 Series |
All XC4000E/L/EX | All XC4000E/L | All XC4000E/L |
All XC4000XL/XLA | All XC4000XL/XLA/EX/XV (3) | XC4000XL/XLA up to XC4020 |
All XC3000 (3) | All XC3000 (3) | All XC3000 (3) |
All XC5200 (3) | All XC5200 (3) | All XC5200 (3) |





1. Evaluation functionality available through the Xilinx ALLSTAR program. For more information on the ALLSTAR program, go to www.xilinx.com. 
2. Functional and timing simulation is performed using an HDL simulator in the ISE product.

3. XC3000, XC5200, and XC4000XV devices are not supported in the Foundation Series ISE configurations. 

4. CoolRunner series is only available in WebFITTER and WebPACK at this time. 

5. Foundation Base does not include a license for FPGA Express. 
Virtex and XC4000X Series FPGAs

Each Virtex family has its own unique features to meet different application requirements. All devices have both distributed RAM and block RAM, and between four and eight DLLs for efficient clock management.

• The Virtex family, consisting of devices that range from 50K up to 1 million logic gates, supports 17 I/O standards, and offers 5V PCI compliance.

• The Virtex-E family offers the highest logic gate count available for any FPGA, ranging from 50K up to 3.2 million system gates, and supports 20 I/O standards including LVPECL, LVDS, and Bus LVDS differential signaling.

• The Virtex-EM Extended Memory family consists of two devices with a high RAM-to-logic gate ratio that is targeted for specific applications such as gigabit per second network switches and high definition graphics.

The XC4000X Series is part of the broad spectrum of Xilinx "XL" products unveiled in September 1998. As a result, Xilinx offers the broadest choice of 3.3V and 2.5V devices available from a single supplier, with densities ranging from 800 to 500,000 system gates. With 12 family members ranging from 30,000 to 500,000 system gates, the devices feature patented SelectRAM memory, with a highly flexible arrangement of logic, single-port, or dual-port memory. Designed in an advanced 0.25 micron process, the XC4000X series delivers industry-leading performance while significantly reducing power consumption.

See www.xilinx.com for more information.

FPGA Product Selection Matrix

Device | Logic Cells | Max. Logic Gates | Typical System Gate Range | Max. RAM Bits | CLB Matrix | CLBs | Flip-Flops | Max. I/O | Output Drive (mA) | PCI Compliant | Voltage Marks

XC4000 Series: High Performance/SelectRAM Memory
XC4013XLA | | 13K | 10K-30K | | | | | | | |
XC4020XLA | 1862 | 20K | 13K-40K | 25K | | | | | | |
XC4028XLA | 2432 | 28K | | 33K | 32x32 | 1024 | 2560 | 256 | | |
XC4036XLA | | | | | | | | | | |
XC4044XLA | | | | | 40x40 | 1600 | 3840 | 320 | | |
XC4052XLA | 4598 | 52K | | 62K | 44x44 | 1936 | 4576 | 352 | | |
XC4062XLA | | | | | 48x48 | | | | | |
XC4085XLA | | | | | | | | | | |

Virtex Family: Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O
XCV50 | | 21K | | 56K | 16x24 | 384 | | | | |
XCV100 | | | 72K-109K | 78K | 20x30 | 600 | 2400 | 180 | | |
XCV150 | 3888 | 47K | 93K-165K | 102K | 24x36 | | | | | |
XCV200 | | | 148K-237K | | | | | | | |
XCV300 | 6912 | 83K | 176K-323K | 160K | 32x48 | 1536 | 6144 | 316 | | | I/O, *
XCV400 | | | | | | | | | | |
XCV600 | 15552 | 187K | 365K-661K | 312K | 48x72 | 3456 | 13824 | 512 | 2/24 | Y | X, I/O, *
XCV800 | 21168 | 254K | 511K-888K | 406K | 56x84 | 4704 | 18816 | 512 | 2/24 | Y | X, *
XCV1000 | | 332K | 622K-1,124K | 512K | 64x96 | 6144 | 24576 | 512 | 2/24 | Y | X

Virtex-E Family: Density/Performance Leadership, BlockRAM, 8 DLLs, LVDS, BLVDS, LVPECL
XCV50E | 1728 | 21K | 47K-72K | 88K | 16x24 | 384 | 1536 | 176 | 2/24 | Y | X, I/O, I/O, **
XCV100E | 2700 | 32K | 105K-128K | 118K | 20x30 | 600 | 2400 | 196 | 2/24 | Y | X, I/O, I/O, **
XCV200E | 5292 | 64K | 215K-306K | 186K | 28x42 | 1176 | | | | |
XCV300E | 6912 | 83K | 254K-412K | 224K | 32x48 | 1536 | 6144 | 316 | 2/24 | Y | X, I/O, I/O, **
XCV400E | 10800 | 130K | 413K-570K | 310K | 40x60 | 2400 | 9600 | 404 | 2/24 | Y | X, I/O, I/O, **
XCV600E | 15552 | 187K | 679K-986K | 504K | 48x72 | 3456 | 13824 | 512 | 2/24 | |
XCV1000E | | | | | 64x96 | 6144 | 24576 | 660 | 2/24 | |
XCV1600E | | | | | 72x108 | 7776 | | 724 | 2/24 | |
XCV2000E | 43200 | 518K | 1,857K-2,542K | 1240K | | 9600 | 38400 | 804 | 2/24 | |
XCV2600E | 57132 | 686K | 2,221K-3,264K | 1530K | 92x138 | 12696 | 50784 | 804 | 2/24 | |
XCV3200E | | | | | | | 64896 | | | |

Virtex Extended Memory Capabilities
XCV405E | 10800 | 130K | 1,068K-1,207K | | 40x60 | | | | | |
XCV812E | 21168 | 254K | 2,569K-3,062K | 1414K | 56x84 | 4704 | 18816 | | | |

* I/Os are 5V tolerant
** 5 Volt tolerant I/Os with external resistor
X = Core and I/O voltage
I/O = I/O voltage supported





Spartan 


Say hello to a new level of performance; the Spartan-II family now includes devices with over 200,000 system gates. You get 100,000 system gates for under $10, at speeds of 200 MHz and beyond, giving you design flexibility that's hard to beat. These low-powered, 2.5V devices feature I/Os that operate at up to 3.3V with full 5V tolerance. Spartan-II devices also feature multiple Delay Locked Loops, on-chip RAM (block and distributed), and versatile I/O technology that supports over 16 high-performance interface standards. You get all this in an FPGA that offers unlimited programmability, and can even be upgraded in the field, remotely, over any network.

Robust Feature Set

• Flexible on-chip distributed and block memory.
• Four digital Delay Locked Loops for efficient chip-level/board-level clock management.
• Select I/O Technology for interfacing with all major bus standards such as HSTL, GTL, SSTL, and so on.
• Full PCI compliance.
• System speeds over 200 MHz.
• Power management.

Extensive Design Support

• Complete suite of design tools.
• Compile designs in minutes.
• Extensive core support.

Advantages Over ASICs

• No costly NRE charges.
• No time consuming vector generation needed.
• All devices are 100% tested by Xilinx.
• Field upgradeable (remotely upgradeable, using Xilinx Online technology).
• No lengthy prototype or production lead times.
• Priced aggressively against comparable ASICs.

For more information see www.xilinx.com/products/spartan2





FPGA Product Selection Matrix

Device | Logic Cells | Max. Logic Gates | Typical Gate Range | Max. RAM Bits | CLB Matrix | CLBs | Flip-Flops | Max. I/O | Output Drive (mA) | Voltage Marks

Spartan Family: High Volume, High Performance/SelectRAM Memory
XCS05 | | | | | 10x10 | 100 | | | |
XCS10 | 466 | | | | 14x14 | 196 | | | |
XCS20 | 950 | 10K | | 13K | 20x20 | 400 | | | |
XCS30 | | 13K | | 18K | 24x24 | | | | |
XCS40 | | 20K | | 25K | 28x28 | | | | |

Spartan-XL Family: High Performance/SelectRAM Memory
XCS05XL | | | | | | 100 | | | |
XCS10XL | 466 | 5K | 3K-10K | 6K | 14x14 | 196 | 616 | 112 | 12/24 |
XCS20XL | 950 | 10K | 7K-20K | 13K | 20x20 | 400 | 1120 | 160 | |
XCS30XL | | | | | | | | | |
XCS40XL | | | | | | 784 | 2016 | 224 | 12/24 |

Spartan-II Family: Distributed RAM, SelectI/O, 4 DLLs
XC2S15 | | | 6K-15K | 22K | | 96 | | | |
XC2S30 | | | | | 12x18 | | | | |
XC2S50 | | | | 56K | 16x24 | 384 | 1536 | 176 | 2/24 | I/O, *
XC2S100 | 2700 | 53K | 37K-100K | 78K | 20x30 | 600 | 2400 | 196 | | I/O, *
XC2S150 | 3888 | 77K | 52K-150K | 102K | 24x36 | | | | |
XC2S200 | | | | | 28x42 | 1,176 | | | |

* I/Os are 5V tolerant
X = Core and I/O voltage
I/O = I/O voltage supported
XC9500 and CoolRunner CPLDs

Whether performing high-speed networking or power-conscious portable designs, Xilinx CPLDs provide you with a complete range of value-oriented products.

XC9500 - Offers industry-leading speeds, while giving you the flexibility of an enhanced customer-proven pin-locking architecture along with extensive IEEE Std. 1149.1 JTAG Boundary-Scan support.

CoolRunner - Offers the patented Fast Zero Power (FZP™) design technology, combining low power and high speed. These devices offer standby currents of less than 100 microamps, operating currents 50-67% lower than traditional CPLDs, and pin-to-pin speeds of 5.0 ns.

WebPOWERED Software Solutions - Offer you the flexibility to target Xilinx CPLD and FPGA products on-line or on the desktop, including:

• WebFITTER - an on-line device fitting and evaluation tool that accepts HDL, ABEL, or netlist files and provides all reports, simulation models, and programming files, along with price quotes. Available to support all Xilinx CPLD products.

• WebPACK ISE - downloadable desktop solutions that offer free CPLD and FPGA software modules for ABEL/HDL synthesis and simulation, device fitting, and JTAG programming.

Through leading performance, free internet-based WebPOWERED software, and the industry's lowest power consumption, Xilinx has the right CPLD and FPGA for every designer's need.

For more information about Xilinx CPLD products, see:
www.xilinx.com/xlnx/xil_product_landingpage.jsp

CPLD Product Selection Matrix

Family | Devices | Key Features
XC9500XV | XC9536XV, XC9572XV, XC95144XV, XC95288XV | Best Pin-Locking, JTAG w/Clamp, High Performance, High Endurance
XC9500XL | XC9536XL, XC9572XL, XC95144XL, XC95288XL | Best Pin-Locking, JTAG w/Clamp, High Performance, High Endurance
XPLA3 | XCR3032XL, XCR3064XL, XCR3072XL, XCR3128XL, XCR3256XL, XCR3384XL | Ultra Low Power, JTAG, Increased Logic Flexibility
XC9500 | XC9536, XC9572, XC95108, XC95144, XC95216, XC95288 | Best Pin-Locking, JTAG, High Endurance
FPGA Configurations
XC18V, XC17V, XC17S

Xilinx offers a full range of configuration memory devices optimized for use with Xilinx FPGAs. Our PROM product lines are designed to meet the same stringent demands as our high-performance FPGAs, taking full advantage of the same advanced processing technologies. In addition, they were developed in close cooperation with Xilinx FPGA designers for optimal performance and reliability.

XC18V00 - Our in-system reprogrammable family provides a feature-rich, fast configuration solution available today, and provides a cost-effective method for reprogramming and storing large Xilinx FPGA bitstreams. This family is JTAG ready and Boundary-Scan enabled for exceptional ease-of-use, system integration, and flexibility.

XC17V00/XC17S00 - Our low-cost XC17V and XC17S families are an ideal configuration solution for cost-sensitive applications. XC17V PROMs are pin-compatible with our XC18V family to allow for a cost-reduction migration path as your production volumes increase. The XC17S family is specially designed to provide a low-cost, integrated solution for our Spartan families of FPGAs.
Configuration PROMs for Virtex-E/Virtex-EM

Device | Configuration Bits | XC17xx Solution | XC18Vxx Solution
XCV50E | 630,048 | 17V01 | 18V01
XCV100E | 863,840 | 17V01 | 18V01
XCV200E | 1,442,106 | 17V02 | 18V02
XCV300E | 1,875,648 | 17V02 | 18V02
XCV400E | 2,693,440 | 17V04 | 18V04
XCV405E | 3,430,400 | 17V04 | 18V04
XCV600E | 3,961,632 | 17V04 | 18V04
XCV812E | 6,519,648 | 17V08 | 2 of 18V04
XCV1000E | 6,587,520 | 17V08 | 2 of 18V04
XCV1600E | 8,308,992 | 17V08 | 2 of 18V04
XCV2000E | 10,159,648 | 17V16 | 3 of 18V04
XCV2600E | | | 3 of 18V04 + 18V512
XCV3200E | 16,283,712 | 17V16 | 4 of 18V04

Packages: 8-pin, 20-pin PLCC, 20-pin SOIC, 44-pin PLCC, and 44-pin VQFP. Some package options are available in XC17Vxx only (*), in XC18Vxx only (**), or in XC18V512 only (***).

Configuration PROMs for Virtex

Device | Configuration Bits | XC17xx Solution | XC18Vxx Solution
XCV50 | 559,200 | 17V01 | 18V01
XCV100 | 781,216 | 17V01 | 18V01
XCV150 | 1,041,096 | 17V01 | 18V01
XCV200 | 1,335,840 | 17V02 | 18V02
XCV300 | 1,751,808 | 17V02 | 18V02
XCV400 | 2,546,048 | 17V04 | 18V04
XCV600 | 3,607,968 | |
XCV800 | 4,715,616 | 17V08 | 18V04 + 18V512
XCV1000 | 6,127,744 | 17V08 | 18V04 + 18V02

Packages: 8-pin, 20-pin PLCC, 20-pin SOIC, 44-pin PLCC, and 44-pin VQFP. Some package options are available in XC17Vxx only (*), in XC18Vxx only (**), or in XC18V512 only (***).

Configuration PROMs for Spartan-XL/Spartan-II

Device | PROM Solution
XCS05XL | XC17S05XL
XCS10XL | XC17S10XL
XCS20XL | XC17S20XL
XCS30XL | XC17S30XL
XCS40XL | XC17S40XL
XC2S15 | XC17S15A
XC2S30 | XC17S30A
XC2S50 | XC17S50A
XC2S100 | XC17S100
XC2S150 | XC17S150A
XC2S200 | XC17S200A

Package options include 8-pin PDIP, 8-pin, and 20-pin packages.
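The PROM selections in the configuration tables follow from simple capacity arithmetic: the chosen PROM (or set of PROMs) must hold at least the device's configuration bit count. The sketch below illustrates that check. The densities are assumptions based on the usual reading of the part numbers (512 = 512 Kbit, 01 = 1 Mbit, 02 = 2 Mbit, 04 = 4 Mbit), and the function names are hypothetical; consult the PROM data sheet for actual usable capacity.

```python
import math

# Nominal PROM densities in bits (assumed from the part numbers, not taken
# from the tables): 512 Kbit, 1 Mbit, 2 Mbit, and 4 Mbit.
PROM_BITS = {
    "18V512": 512 * 1024,
    "18V01": 1 * 1024 * 1024,
    "18V02": 2 * 1024 * 1024,
    "18V04": 4 * 1024 * 1024,
}


def smallest_single_prom(config_bits):
    """Return the smallest single PROM that holds config_bits, or None."""
    for name, capacity in sorted(PROM_BITS.items(), key=lambda kv: kv[1]):
        if config_bits <= capacity:
            return name
    return None  # bitstream needs more than one PROM


def count_18v04(config_bits):
    """Number of 18V04 PROMs needed if only 4 Mbit parts are used."""
    return math.ceil(config_bits / PROM_BITS["18V04"])


if __name__ == "__main__":
    # Configuration bit counts taken from the tables above.
    for device, bits in [("XCV50", 559_200), ("XCV300", 1_751_808),
                         ("XCV1000E", 6_587_520), ("XCV3200E", 16_283_712)]:
        single = smallest_single_prom(bits)
        print(device, single if single else f"{count_18v04(bits)} of 18V04")
```

Under these assumed densities the results agree with the table entries for the sampled devices. Mixed combinations such as 18V04 + 18V512 would need a slightly richer search than this sketch performs.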
QML-Certified FPGAs and PROMs

The Xilinx QPro family of Radiation Hardened FPGAs and PROMs is finding homes in many new satellite and space applications. Both the XQR4000XL and XQVR Virtex products are being designed into space systems that will utilize reconfigurable technology. Numerous communications and GPS satellites, space probes, and shuttle missions are included on the growing list of programs that will be flying these devices.

The Virtex QPro family of High Reliability products is experiencing a high degree of success in the defense market. As designers find it more and more difficult to find components suitable for the harsh environments seen by defense systems, they are discovering that they can incorporate the functions of obsolete parts into Virtex QPro products. This has the added long term advantage of significantly reducing the costs of future re-qualifications, because their systems can retain consistent form, fit, and function through the use of Virtex QPro FPGAs. This cannot be achieved with costly and inflexible ASICs or custom logic.

Please visit http://www.xilinx.com/products/hirel_qml.htm for all the latest information about these products, including some new application notes.

FPGA Product Selection Matrix

Family | Devices | Key Features
XC4000 Series (1) | XQR/XQ4013XL **, XQ4085XL | Density, Memory
Virtex Family | XQV100, XQVR/XQV300 **, XQVR/XQV600 **, XQVR/XQV1000 ** | Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O, 4 DLLs

XQVR/XQV1000: 27,648 logic cells | 332K max. logic gates | 622K-1,124K typical system gate range | 512K max. RAM bits

* I/Os are 5V tolerant
** XQR and XQVR devices are Radiation Hardened
X = Core and I/O voltage
I/O = I/O voltage supported
(1) Selected XQ4000E/EX devices also available

QPro QML-certified PROMs

Device | Density
XC17128D |
XC17256D |
XQR/XQ1701L * | 1Mb
XQR/XQ18V04 * | 4Mb

* XQR devices are Radiation Hardened.
** XQ devices only.




The Most Comprehensive and Highest
Quality Solution in the PLD Industry

The Xilinx Intellectual Property Solutions
Division offers the best selection of
Intellectual Property solutions for a wide
variety of industries and applications. Xilinx
Smart-IP Technology delivers high performance,
flexibility, and predictability, with optimized
cores that give you both reduced cost
and faster time to market.


LogiCORE™ Products — Licensed and sup- 
ported by Xilinx, LogiCORE products such 
as parameterizable DSP building blocks and 
memory cores are included with the Xilinx 
CORE Generator software which is a com- 
ponent of your Xilinx Foundation Series or 
Alliance Series software. 


AllianceCORE™ Products — A cooperative pro- 
gram with third-party IP suppliers who sell 
and support their cores directly with Xilinx 
customers. AllianceCORE products must 
meet criteria that ensure they deliver value 
and performance in a Xilinx device. 


Reference Design Alliance Program - Xilinx 
proactively supports development of third- 
party system-level Reference Designs to pro- 
vide fully functional, modular designs, that 
offer considerable development time savings. 


XPERTS Program — The worldwide XPERTS 
Program provides over 70 consultants certi- 
fied in delivering turnkey system designs for 
the Xilinx architecture, including PCI 
designs, new design methodologies, system- 
level design, along with IP customization 
and integration. 


IP Delivery Tools —- The Xilinx CORE 
Generator™ enables cataloging and genera- 
tion of parameterized cores that are high per- 
formance, predictable, and integrated with 
our system-level design reuse tools; the cores 
are provided in VHDL and Verilog behav- 


ioral description languages. 


The IP Center Internet portal provides access
to the latest LogiCORE and AllianceCORE
products and reference designs via Smart
Search; you can easily find the IP that you
need at www.xilinx.com/ipcenter. Advanced
function cores are available for IP evaluation
and can be purchased via the IP Center.


Design Reuse — Download the "Reuse Field 
Guide Methodology for FPGA and ASIC 
Designs." Then use the Xilinx IP Capture 
Tool to package your IP with simulation 
models, testbenches, and PDF or HTML 
files. Then, you can catalog and share your IP 
using the CORE Generator. 


The REAL PCI 64/66 - Parameterizable PCI
cores, reference designs, prototyping boards,
education, and Xilinx PCI XPERTS, combined
with a proven design and guaranteed
timing, make Xilinx PCI the lowest-risk
solution in the market.


The Xilinx DSP Solution — Our exclusive FPGA 
partnership with MathWorks enables you to 
create complex, high performance DSP 
designs in a familiar environment with huge 
time to market advantages. Xilinx and its 
partners offer a complete set of cores for 
high-performance low-cost DSP implemen- 
tation that provide: 


• Xtreme Flexibility - Distributed DSP
resources (such as look up tables, registers,
multipliers, memory) and segmented routing
allow optimized implementation of
algorithms. Plus you get all the traditional
FPGA benefits:


- RAM-based FPGA technology, for fast 


and easy design changes 


- Fast time to market, to give you a com- 
petitive advantage 


- Field upgradeable systems (using 
IRL™), for extended product lifecycle 


• Xtreme Productivity - The industry's first
System Generator for Simulink® bridges
the gap between FPGA and conventional
DSP design flows, and features:

- Unique constraint-driven Filter
Generator, for performance/cost
optimization

- Power estimator tool (Xpower™), for
very low-power DSP implementations

- Eleven optimized DSP algorithms
(cores) that cut development time by
weeks

- New DSP features added to the
ChipScope ILA tool, which rapidly
reduce hardware debugging time

• Xtreme Performance - Table 1 illustrates
the amazing performance you can achieve
with Xilinx DSP.


Table 1 - Extreme Performance

Function | Industry's Fastest DSP Processor Core | Virtex-E -08 | Virtex-II
MACs per second (multiply and accumulate, 8x8-bit) | 4.4 Billion | 128 Billion | 600 Billion
FIR Filter (256-tap, linear phase, 16-bit data/coefficients) | 17 MSPS @ 1.1 GHz | 160 MSPS @ 160 MHz | 180 MSPS @ 180 MHz
FFT (1024 point, complex data, 16-bit real and imaginary components) | 7.1 µs @ 800 MHz | 41 µs @ 100 MHz | <1 µs @ 140 MHz
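As a rough cross-check of the FIR figures in Table 1, a direct-form FIR filter needs about one multiply-accumulate (MAC) per tap per output sample. The sketch below assumes that simple model, which ignores the savings a linear-phase filter can get from coefficient symmetry; the function name is illustrative only.

```python
def fir_macs_per_second(taps, samples_per_second):
    """MAC rate for a direct-form FIR: one MAC per tap per output sample."""
    return taps * samples_per_second


TAPS = 256  # the 256-tap filter listed in Table 1

# Sample rates taken from the FIR Filter entries in Table 1.
rate_17_msps = fir_macs_per_second(TAPS, 17e6)
rate_160_msps = fir_macs_per_second(TAPS, 160e6)

print(f"17 MSPS  -> {rate_17_msps / 1e9:.2f} billion MACs/s")
print(f"160 MSPS -> {rate_160_msps / 1e9:.2f} billion MACs/s")
```

Under this model, 17 MSPS works out to about 4.35 billion MACs per second, close to the 4.4 billion figure quoted in the table, while 160 MSPS needs roughly 41 billion, well below the 128 billion listed in the MACs-per-second row.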
Introducing the Xilinx 3.2i software release . . . the fastest in the industry

Watch your designs go supersonic. With the new Xilinx 3.2i release, you can place and route your next 100,000-gate design using a Spartan® FPGA in just one minute, or your next one-million-gate design using a Virtex™-E FPGA in only thirty minutes.

Faster tools improve your time-to-market advantage

With Xilinx's ultra-fast software you can beat your competition to market every time. When comparing the time it takes to complete your design, Xilinx place and route finishes up to 8 times faster for small designs, and up to 12 times faster for the most complex, high density designs.

See for yourself

Visit www.xilinx.com/3_2i.htm today and see how fast we can make your design. At Xilinx, we give you all the speed you need.

XILINX
The Programmable Logic Company™
www.xilinx.com